Mathematical methods for economic theory

Martin J. Osborne

1.5 Calculus: one variable

Differentiation

Let f be a function of a single variable defined on an open interval. This function is differentiable at the point a if it has a well-defined tangent at a. Its derivative at a, denoted f'(a), is the slope of this tangent.

Precisely, consider “secant lines” like the one from (a, f(a)) to (a + h, f(a + h)) in the following figure.

[Figure: the graph of f(x), showing the tangent at a, with slope f'(a), and the secant line from (a, f(a)) to (a + h, f(a + h)).]

Such a line has slope (f(a + h) − f(a))/h. The derivative of f at a is defined to be the limit, if it exists, of this slope as h approaches zero (from either side).

Definition
The function f of a single variable defined on an open interval I is differentiable at the point a in I if
$$\lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$
exists, in which case this limit is the derivative of the function f at a, denoted f'(a). If f is differentiable at every point in I then it is differentiable on I and the function f' is its derivative.
Note that since a is in I and I is open, for h sufficiently small the point a + h is in I, so that f(a + h) is defined. The statement “$\lim_{h \to 0} (f(a + h) - f(a))/h$ exists” means, precisely, that there is a number k such that for every number ε > 0 (no matter how small), we can find a number δ > 0 such that if 0 < |h| < δ then the absolute value of the difference between (f(a + h) − f(a))/h and k is less than ε.
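The limit in the definition can be explored numerically. The following is a minimal sketch, not part of the original text: the function f(x) = x^3, the point a = 2 and the helper name `difference_quotient` are illustrative assumptions. It evaluates the secant slope for values of h of both signs that shrink towards zero; the slopes approach f'(2) = 12.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line from (a, f(a)) to (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def f(x):
    # Illustrative choice (an assumption): f(x) = x^3, so f'(2) = 12.
    return x ** 3


a = 2.0
# Use positive and negative h: the limit must be the same from both sides.
for h in [0.1, -0.1, 0.001, -0.001, 1e-6, -1e-6]:
    slope = difference_quotient(f, a, h)
    print(f"h = {h:>9}: secant slope = {slope:.6f}")
```

A numerical experiment of this kind only suggests the value of the limit; it does not prove that the limit exists.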

The graph of a function differentiable at a is “smooth” at a. In particular, a function that is differentiable at a is definitely continuous at a.

Proposition 1.5.1
If the function f of a single variable defined on an open interval I is differentiable at the point a in I then it is continuous at a.
Proof  
We need to show that $\lim_{h \to 0} (f(a + h) - f(a)) = 0$. For h ≠ 0, write $f(a + h) - f(a)$ as $[(f(a + h) - f(a))/h] \cdot h$. Then $\lim_{h \to 0} (f(a + h) - f(a)) = \lim_{h \to 0} [(f(a + h) - f(a))/h] \cdot \lim_{h \to 0} h = f'(a) \cdot 0 = 0$.
An example of a function that is not differentiable at some point is shown in the following figure.

[Figure: the graph of a function f(x) with a kink at the point (a, f(a)).]

The function f in the figure is not differentiable at a, because the slope of the secant line from (a, f(a)) to (a + h, f(a + h)) is very different for h > 0 and for h < 0, even when h is arbitrarily small, as shown in the following two figures. Thus $\lim_{h \to 0} (f(a + h) - f(a))/h$ does not exist: there is no number k such that for all h close enough to 0 the slope of a secant line through (a, f(a)) and (a + h, f(a + h)) is close to k.

[Figures: the graph of f(x) with the secant line from (a, f(a)) to (a + h, f(a + h)), for h > 0 (first panel) and for h < 0 (second panel).]

At any point at which a function has such a “kink”, the function is not differentiable.
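A standard function with a kink is the absolute value function, which is not the function in the figures but behaves in the same way at 0. The sketch below (the helper name `secant_slope` is an assumption) computes secant slopes at 0 for small positive and negative h; the two one-sided slopes stay at 1 and −1, so no single number k can be the limit.

```python
def secant_slope(f, a, h):
    """Slope of the secant line from (a, f(a)) to (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


steps = (0.1, 0.001, 1e-6)

# Secant slopes of |x| at 0, approached from the right (h > 0) ...
slopes_right = [secant_slope(abs, 0.0, h) for h in steps]
# ... and from the left (h < 0).
slopes_left = [secant_slope(abs, 0.0, -h) for h in steps]

print("h > 0:", slopes_right)
print("h < 0:", slopes_left)
```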

The derivative of f at a is sometimes denoted (df/dx)(a) rather than f'(a) (although the latter is more compact, more elegant, and more precise).

Rules for differentiation

The definition of a derivative implies the following formulas for the derivative of specific functions, where a, k, and n are constants (with a > 0).

f(x)    f'(x)

$k$    $0$
$kx^n$    $knx^{n-1}$
$\ln x$    $1/x$
$e^x$    $e^x$
$a^x$    $a^x \ln a$
$\cos x$    $-\sin x$
$\sin x$    $\cos x$
$\tan x$    $1 + (\tan x)^2$

Here are three (very important!) general rules.

Sum rule
If F(x) = f(x) + g(x) then F'(x) = f'(x) + g'(x)
Product rule
If F(x) = f(x)g(x) then F'(x) = f'(x)g(x) + f(x)g'(x)
Quotient rule
If $F(x) = f(x)/g(x)$ then $F'(x) = [f'(x)g(x) - f(x)g'(x)]/(g(x))^2$.
Note that if you know that the derivative of $(g(x))^n$ is $n(g(x))^{n-1}g'(x)$ (an implication of the “chain rule”, discussed later), the quotient rule follows directly from the product rule: if you write $f(x)/g(x)$ as $f(x)(g(x))^{-1}$ then the product rule implies that the derivative is
$$f'(x)(g(x))^{-1} - f(x)(g(x))^{-2}g'(x),$$
which is equal to $[f'(x)g(x) - f(x)g'(x)]/(g(x))^2$.
Example 1.5.1
Let $F(x) = x^2 + \ln x$. By the sum rule, $F'(x) = 2x + 1/x$.
Example 1.5.2
Let $F(x) = x^2 \ln x$. By the product rule, $F'(x) = 2x \ln x + x^2/x = 2x \ln x + x$.
Example 1.5.3
Let $F(x) = x^2/\ln x$. By the quotient rule, $F'(x) = [2x \ln x - x^2/x]/(\ln x)^2 = [2x \ln x - x]/(\ln x)^2$.
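The derivatives in Examples 1.5.1 to 1.5.3 can be checked numerically. The sketch below compares each formula with a central difference approximation, (F(x + h) − F(x − h))/(2h), which is a common numerical substitute for the limit in the definition and is an assumption of this sketch rather than a method used in the text. The evaluation point x = 2 is arbitrary.

```python
import math


def central_difference(F, x, h=1e-6):
    """Approximate F'(x) by the slope between x - h and x + h."""
    return (F(x + h) - F(x - h)) / (2 * h)


# Each entry pairs a function F with the derivative claimed in the examples.
examples = {
    "sum (1.5.1)": (
        lambda x: x ** 2 + math.log(x),
        lambda x: 2 * x + 1 / x,
    ),
    "product (1.5.2)": (
        lambda x: x ** 2 * math.log(x),
        lambda x: 2 * x * math.log(x) + x,
    ),
    "quotient (1.5.3)": (
        lambda x: x ** 2 / math.log(x),
        lambda x: (2 * x * math.log(x) - x) / math.log(x) ** 2,
    ),
}

x0 = 2.0
errors = {}
for name, (F, dF) in examples.items():
    errors[name] = abs(central_difference(F, x0) - dF(x0))
    print(f"{name}: formula = {dF(x0):.6f}, numerical error = {errors[name]:.2e}")
```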

Second derivatives

If the function f is differentiable at every point in some open interval I then its derivative f' may itself be differentiable at points in this interval.
Definition
Let f be a function of a single variable defined on an open interval I. If the derivative f' of f is differentiable then we say f is twice-differentiable and call the derivative of f', which we denote f", the second derivative of f.
Here's a simple example of a function that is differentiable, with a continuous derivative, but is not everywhere twice-differentiable.
Example 1.5.4
Let $f(x) = x^2$ for x ≤ 0 and f(x) = 0 for x > 0. Then f is differentiable, with f'(x) = 2x for x ≤ 0 and f'(x) = 0 for x > 0. The derivative f' is continuous, but has a kink at 0, so f' is not differentiable at 0, and thus f is not twice-differentiable at 0.
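The kink in f' in Example 1.5.4 shows up in the secant slopes of f' at 0. In the sketch below (the names `f_prime` and `secant_slope` are assumptions), the slopes are 0 for every h > 0 and 2 for every h < 0, so the limit that would define f"(0) does not exist.

```python
def f_prime(x):
    """Derivative of f in Example 1.5.4: 2x for x <= 0 and 0 for x > 0."""
    return 2 * x if x <= 0 else 0.0


def secant_slope(g, a, h):
    """Slope of the secant line of g from a to a + h."""
    return (g(a + h) - g(a)) / h


steps = (0.1, 0.001, 1e-6)
slopes_right = [secant_slope(f_prime, 0.0, h) for h in steps]   # h > 0
slopes_left = [secant_slope(f_prime, 0.0, -h) for h in steps]   # h < 0

print("h > 0:", slopes_right)
print("h < 0:", slopes_left)
```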

Integration

Let f be a function of a single variable on the domain [a, b]. The (“definite”) integral of f from a to b, denoted
$$\int_a^b f(x)\,dx,$$
is defined to be the area between the horizontal axis and the graph of f, between a and b. In any regions where f(x) < 0, the corresponding areas, like the one shaded pink in the figure, count negatively in the integral. Thus the integral from a to b of the function f in the figure is the sum of the blue areas minus the pink area.

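[Figure: the graph of f(x) between a and b; the areas above the horizontal axis are shaded blue and the area below it is shaded pink.]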

How exactly can we define the area under the graph of a function? Let's approximate the area with a shape whose area we know how to calculate. First choose some points $x_1, x_2, \ldots, x_n$ in [a, b] with $a = x_1 < x_2 < \cdots < x_n = b$. We're going to refer to such collections of points repeatedly, so let's give them a name.

Definition
A partition of the interval [a, b] is a list $(x_1, x_2, \ldots, x_n)$ of numbers with $a = x_1 < x_2 < \cdots < x_n = b$.
Now construct the (discontinuous) function m that is constant on every interval $[x_i, x_{i+1}]$ for i = 1, ..., n−1 and is the largest such function that lies nowhere above f. For the partition $(x_1, \ldots, x_8)$ (with $x_1 = a$ and $x_8 = b$) in the following figure, this function is indicated in green.

[Figure: the graph of f(x) on [a, b] with the partition $(a, x_2, \ldots, x_7, b)$ and the step function m(x), shown in green.]

The area under the graph of m consists of a collection of rectangles, whose areas we know: it is $\sum_{i=1}^{n-1} m_i(x_{i+1} - x_i)$, where $m_i$ is the value of m(x) for x between $x_i$ and $x_{i+1}$. (The areas shaded light green in the figure count negatively, because $m_i < 0$ for them.) We give a name to this area: the “lower sum of f for the partition”. It is defined precisely as follows.

Definition
Let f be a function of a single variable defined on the interval [a, b] and let $(x_1, \ldots, x_n)$ be a partition of [a, b]. For i = 1, ..., n−1 let $m_i$ be the largest number such that $f(x) \geq m_i$ for all x in the interval $[x_i, x_{i+1}]$. The lower sum of f for $(x_1, \ldots, x_n)$ is
$$\sum_{i=1}^{n-1} m_i(x_{i+1} - x_i).$$
Why does the definition say that $m_i$ is the largest number such that $f(x) \geq m_i$ for all x in the interval $[x_i, x_{i+1}]$, rather than simply saying that $m_i$ is the smallest value of f(x) for x in the interval? Because f(x) might not have a smallest value for x in the interval. Consider, for example, the function f in the following figure. This function is discontinuous at the point a. For values of x less than a and close to a, the value of the function is close to m. But for no value of x is f(x) equal to m; in particular, f(a) = y. Thus f(x) has no minimum on the interval $[x_1, x_2]$. However, the number m is the largest number such that f(x) ≥ m for all x in the interval.

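[Figure: the graph of a function f(x) on $[x_1, x_2]$ that is discontinuous at a, with f(a) = y and with values close to m for x less than and close to a.]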

For the partition $(x_1, \ldots, x_8)$ shown in the figure, the function m is not a very good approximation for the function f; the area under the graph of m is not very close to the area under the graph of f. But if we make the partition finer, the difference between the two areas diminishes, as the next figure illustrates.

[Figure: the graph of f(x) on [a, b] with the step function m(x) for a finer partition.]

Before we think about the limit of the areas as we make the partitions finer, consider another approximation to the area under the graph of f. The function m we have constructed lies everywhere below or on the function. We can alternatively approximate f by a function M that lies everywhere above or on the function, as illustrated in the next figure.

[Figure: the graph of f(x) on [a, b] with a step function M(x) that lies everywhere above or on f.]

The analogue of the lower sum for M is called the “upper sum”.

Definition
Let f be a function of a single variable defined on the interval [a, b] and let $(x_1, \ldots, x_n)$ be a partition of [a, b]. For i = 1, ..., n−1, let $M_i$ be the smallest number such that $f(x) \leq M_i$ for all x in the interval $[x_i, x_{i+1}]$. The upper sum of f for $(x_1, \ldots, x_n)$ is $\sum_{i=1}^{n-1} M_i(x_{i+1} - x_i)$.
Now suppose that as we make the partition increasingly fine, both the lower and upper sums converge to the same number. Then that number seems to be a sensible value for the area under the graph of f, and indeed it's the value we assign to the integral of f between a and b.
Definition
Let f be a function of a single variable defined on the interval [a, b]. Let M be the smallest number such that the lower sum of f for P is at most M for every partition P of [a, b] and let m be the largest number such that the upper sum of f for P is at least m for every partition P of [a, b]. Then f is integrable if M = m, and the common value is the definite integral of f from a to b, denoted
$$\int_a^b f(x)\,dx.$$
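The convergence of lower and upper sums can be illustrated numerically. The sketch below makes two simplifying assumptions that are not part of the definitions: the partitions are equally spaced, and f is nondecreasing on [a, b], so that on each interval $[x_i, x_{i+1}]$ we have $m_i = f(x_i)$ and $M_i = f(x_{i+1})$. The function f(x) = x^2 on [0, 1], whose integral is 1/3, and the helper names are illustrative.

```python
def uniform_partition(a, b, n):
    """Partition (x_1, ..., x_n) of [a, b] with n equally spaced points."""
    return [a + (b - a) * i / (n - 1) for i in range(n)]


def lower_and_upper_sums(f, partition):
    """Lower and upper sums of f for the partition.

    ASSUMPTION: f is nondecreasing, so on [x_i, x_{i+1}] the largest lower
    bound is f(x_i) and the smallest upper bound is f(x_{i+1}).
    """
    intervals = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in intervals)
    upper = sum(f(right) * (right - left) for left, right in intervals)
    return lower, upper


def f(x):
    return x ** 2  # nondecreasing on [0, 1]


for n in (5, 50, 500, 5000):
    lower, upper = lower_and_upper_sums(f, uniform_partition(0.0, 1.0, n))
    print(f"n = {n:>4}: lower sum = {lower:.6f}, upper sum = {upper:.6f}")
```

As the partition becomes finer, the two sums squeeze the value 1/3 from below and above. For a function that is not monotone, $m_i$ and $M_i$ cannot be read off the endpoints in this way.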
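Every continuous function is integrable, as the following result asserts.
Proposition 1.5.2
If the function f of a single variable is continuous on [a, b] then it is integrable on [a, b].
Source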
For proofs, see Spivak (1980), Theorem 3 on p. 246 and Rudin (1976), Theorem 6.8 on p. 125 (a more general result).
Many functions that are not continuous are integrable. In fact, functions that are not integrable are fairly exotic. An example is the function f with domain [0, 1] defined by f(x) = 1 if x is a rational number and f(x) = 0 if x is an irrational number. For this function the lower sum for any partition is 0 and the upper sum for any partition is 1.

Note that the variable x in the expression for a definite integral is a dummy variable, and can be replaced by any other variable. Sometimes it is dropped entirely, and the integral is written simply as $\int_a^b f$.

Fundamental theorem of calculus

Let f be an integrable function of a single variable defined on the interval [a, b]. Define the function F on [a, b] by $F(x) = \int_a^x f(z)\,dz$. How does the value of F change as x increases? An increase in x by a small amount ε adds an area almost equal to a narrow rectangle of height f(x) to the area under f. Thus the rate of increase in the area, for ε small, is close to f(x). This argument suggests that the derivative of F at x is f(x). The following result states this conclusion precisely.
Proposition 1.5.3 (Fundamental theorem of calculus)
Let f be an integrable function of a single variable defined on the interval [a, b]. Define the function F of a single variable on [a, b] by
$$F(x) = \int_a^x f(z)\,dz.$$
If f is continuous at the point c in [a, b], then F is differentiable at c and
$$F'(c) = f(c).$$
Similarly, define the function G on [a, b] by
$$G(x) = \int_x^b f(z)\,dz.$$
If f is continuous at the point c in [a, b], then G is differentiable at c and
$$G'(c) = -f(c).$$
If f is integrable on [a, b] and f = H' for some function H, then
$$\int_a^b f(z)\,dz = H(b) - H(a).$$
Source
For a proof of the first two claims, see Spivak (1980), Theorem 1 on p. 268. For a proof of the last claim, see Spivak (1980), Theorem 2 on p. 272.
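The three claims of Proposition 1.5.3 can be checked numerically in a specific case. The sketch below is illustrative only: it takes f = cos on [0, 2], approximates integrals by the midpoint rule, and approximates derivatives by central differences. Both numerical methods, and the names `integral`, `F` and `G`, are assumptions of the sketch.

```python
import math


def integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


f = math.cos            # continuous everywhere; an antiderivative is sin
a, b, c, h = 0.0, 2.0, 1.0, 1e-3


def F(x):
    return integral(f, a, x)   # area from a up to x


def G(x):
    return integral(f, x, b)   # area from x up to b


F_slope = (F(c + h) - F(c - h)) / (2 * h)   # should be close to f(c)
G_slope = (G(c + h) - G(c - h)) / (2 * h)   # should be close to -f(c)

print(f"F'(c) ~ {F_slope:.6f},  f(c) = {f(c):.6f}")
print(f"G'(c) ~ {G_slope:.6f}, -f(c) = {-f(c):.6f}")
print(f"integral = {integral(f, a, b):.8f}, "
      f"H(b) - H(a) = {math.sin(b) - math.sin(a):.8f}")
```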

Antiderivatives and indefinite integration

The function H in the last part of this result is an “antiderivative” of f: a function whose derivative is f. (Some authors use the term “primitive” instead of “antiderivative”.)

Definition
Let f be a function of a single variable. An antiderivative of f is a function F for which F'(x) = f(x) for all x in the domain of f.

As an example, an antiderivative of the function 2x is the function $x^2$. But $x^2$ is not the only antiderivative of 2x: for any number c, the function $x^2 + c$ is an antiderivative, and in fact every antiderivative of 2x has this form. In general, if F is an antiderivative of f, then the function G is an antiderivative of f if and only if for some number c we have G(x) = F(x) + c for all x.

We refer to the set of all antiderivatives of a function f as the “indefinite integral” of f, which we denote by $\int f(x)\,dx$. (This notation makes sense given the fundamental theorem of calculus.) For example, $\int 2x\,dx$ = {F: F(x) = $x^2$ + c for some number c, for all x}, which we usually write more briefly as $\int 2x\,dx = x^2 + c$.

Definition
Let f be a function of a single variable. The indefinite integral of f, denoted
$$\int f(z)\,dz$$
or
$$\int f,$$
is the set of antiderivatives of f.

For many functions, finding the indefinite integral is not easy. In fact, the indefinite integrals of many functions cannot be written as explicit formulae.

Some indefinite integrals that may be expressed simply are

$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + c \quad (n \neq -1),$$
$$\int e^x\,dx = e^x + c,$$
and
$$\int (1/x)\,dx = \ln |x| + c.$$
A useful fact to employ when finding some integrals is that the derivative of $\ln f(x)$ is $f'(x)/f(x)$ wherever f(x) > 0 (an implication of the chain rule, discussed later). Thus if you can express the function you are integrating in the form f'(x)/f(x) with f(x) > 0, its integral is $\ln f(x) + c$. For example, since the derivative of $x^2 + 1$ is 2x,
$$\int \frac{x}{x^2 + 1}\,dx = \tfrac{1}{2}\ln(x^2 + 1) + c.$$
We can find the indefinite integral of some functions by using the following result.
Proposition 1.5.4 (Integration by parts)
Let f and g be differentiable functions of a single variable whose derivatives are continuous. Then
$$\int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx.$$
Proof
Define the function h by h(x) = f(x)g(x) for all x. Then by the product rule for differentiation, h'(x) = f'(x)g(x) + f(x)g'(x). Given that the derivatives f' and g' are continuous, h' is continuous and hence integrable. Integrating h' we get $h(x) = f(x)g(x) = \int f'(x)g(x)\,dx + \int f(x)g'(x)\,dx$, which rearranges to the formula in the result.
The result is useful if we can express the function we want to integrate as a product f(x)g'(x) and we can easily find the integral of f'(x)g(x). The following example illustrates this point.
Example 1.5.5
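Take f(x) = x and $g'(x) = e^x$, so that f'(x) = 1 and $g(x) = e^x$. Then $\int xe^x\,dx = xe^x - \int e^x\,dx = xe^x - e^x + c$. We cannot integrate $xe^x$ directly, but we can integrate the product of the derivative of x (namely 1) and the integral of $e^x$ (namely $e^x$), because that product is simply $e^x$.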
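The antiderivative found in Example 1.5.5 can be verified in two ways: by differentiating it, and by comparing a numerically computed definite integral with the difference of its values at the endpoints (the last claim of the fundamental theorem of calculus). The sketch below does both on the interval [0, 1]; the numerical methods (central difference and midpoint rule), the interval and the function names are assumptions.

```python
import math


def integrand(x):
    return x * math.exp(x)


def antiderivative(x):
    # Result of Example 1.5.5, taking the constant c = 0.
    return x * math.exp(x) - math.exp(x)


def central_difference(F, x, h=1e-6):
    """Approximate F'(x) by the slope between x - h and x + h."""
    return (F(x + h) - F(x - h)) / (2 * h)


def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


# Check 1: the derivative of the antiderivative is the integrand.
derivative_gap = abs(central_difference(antiderivative, 1.5) - integrand(1.5))

# Check 2: the definite integral on [0, 1] equals the change in the antiderivative.
definite_numeric = midpoint_integral(integrand, 0.0, 1.0)
definite_exact = antiderivative(1.0) - antiderivative(0.0)

print(f"derivative gap at x = 1.5: {derivative_gap:.2e}")
print(f"numerical integral: {definite_numeric:.8f}, exact: {definite_exact:.8f}")
```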
The same formula may be used to calculate the definite integral of a function:
$$\int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx.$$
The next example illustrates this formula. It shows also how even when the function you are integrating naturally has only one “part”, integration by parts can sometimes be used to advantage by creating a second part equal to the constant function that is equal to 1 for all values of x.
Example 1.5.6
Let H be a cumulative probability distribution on [a, b]. That is, H is a nondecreasing function with H(a) = 0 and H(b) = 1. We can find an expression for
$$\int_a^b H(x)\,dx$$
by using integration by parts with f(x) = H(x) and g'(x) = 1. We have g(x) = x, so that
$$\int_a^b H(x)\,dx = H(b)g(b) - H(a)g(a) - \int_a^b H'(x)x\,dx$$
and hence, given H(a) = 0 and H(b) = 1,
$$\int_a^b H(x)\,dx = b - \int_a^b H'(x)x\,dx.$$
Now,
$$\int_a^b H'(x)x\,dx$$
is the expected value of the distribution, which we can denote $E_H(x)$. So we have
$$\int_a^b H(x)\,dx = b - E_H(x),$$
an expression that is sometimes useful.
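The identity in Example 1.5.6 can be checked for a particular distribution. The sketch below assumes the distribution H(x) = x^2 on [0, 1], which is nondecreasing with H(0) = 0 and H(1) = 1 and has density H'(x) = 2x; this choice, the midpoint rule and the function names are all illustrative assumptions. Both sides of the identity should be close to 1/3.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


a, b = 0.0, 1.0


def H(x):
    return x ** 2      # hypothetical cumulative distribution on [0, 1]


def H_prime(x):
    return 2 * x       # its density


# Left-hand side: the integral of H from a to b.
integral_of_H = midpoint_integral(H, a, b)

# Right-hand side: b minus the expected value, the integral of x * H'(x).
expected_value = midpoint_integral(lambda x: x * H_prime(x), a, b)
right_hand_side = b - expected_value

print(f"integral of H  = {integral_of_H:.8f}")
print(f"b - E_H(x)     = {right_hand_side:.8f}")
```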
Another technique that is useful when calculating some integrals is changing the variable. Suppose we wish to calculate an integral that can be written as
$$\int f(g(x))g'(x)\,dx$$
for functions f and g, and we know the antiderivatives of f. Let y = g(x), so that (taking some liberties, justified by the next result) we have dy = g'(x)dx and the integral becomes
$$\int f(y)\,dy,$$
which is F(y) + C, where F is an antiderivative of f. Given y = g(x), this expression is equal to F(g(x)) + C, so that
$$\int f(g(x))g'(x)\,dx = F(g(x)) + C.$$
Proposition 1.5.5 (Change of variable in integration)
Let g be a differentiable function of a single variable on the interval [a, b], let f be a function of a single variable on the range of g, and let F be an antiderivative of f (that is, F' = f). Then
$$\int f(g(x))g'(x)\,dx = F(g(x)) + C$$
and if f and g' are continuous then
$$\int_a^b f(g(x))g'(x)\,dx = F(g(b)) - F(g(a)) = \int_{g(a)}^{g(b)} f(y)\,dy.$$
Proof
Define the function h by h(x) = F(g(x)) for all x. By the chain rule, h'(x) = F'(g(x))g'(x) = f(g(x))g'(x), establishing the first claim.

By a previous result, a continuous function is integrable, so that f and the function k defined by k(x) = f(g(x))g'(x) are both integrable. Then by the fundamental theorem of calculus applied to the function k we have

$$\int_a^b f(g(x))g'(x)\,dx = h(b) - h(a).$$
Now, h(b) − h(a) = F(g(b)) − F(g(a)) and the fundamental theorem of calculus applied to f yields
$$F(g(b)) - F(g(a)) = \int_{g(a)}^{g(b)} f(y)\,dy.$$
Example 1.5.7
Consider the problem of finding the indefinite integral
$$\int rx(sx^2 + t)^k\,dx,$$
where r, s, t, and k are constants with s ≠ 0 and k ≠ −1. Notice that the first factor, rx, is proportional to the derivative of $sx^2 + t$. Specifically, if we define g by $g(x) = sx^2 + t$, so that g'(x) = 2sx, and f by $f(z) = z^k$, then the integral is
$$\frac{r}{2s}\int f(g(x))g'(x)\,dx.$$
By Proposition 1.5.5, this integral is
$$\frac{r}{2s}(F(g(x)) + C),$$
where F is an antiderivative of f. The antiderivatives of f take the form $z^{k+1}/(k+1)$, so that the indefinite integral is
$$\frac{r}{2s} \cdot \frac{1}{k+1}(sx^2 + t)^{k+1} + C.$$
(By differentiating this expression you can verify that it is correct.)

Another way to express this calculation is that we are making the substitution $y = sx^2 + t$, so that dy = 2sx dx, which transforms

$$\int rx(sx^2 + t)^k\,dx$$
to
$$\frac{r}{2s}\int y^k\,dy = \frac{r}{2s} \cdot \frac{1}{k+1}y^{k+1} + C.$$
Substituting $sx^2 + t$ for y yields the function we found previously.

For the definite integral

$$\int_a^b rx(sx^2 + t)^k\,dx$$
we have
$$\int_a^b rx(sx^2 + t)^k\,dx = \frac{r}{2s}\int_{g(a)}^{g(b)} f(z)\,dz = \frac{r}{2s} \cdot \frac{1}{k+1}\left[(g(b))^{k+1} - (g(a))^{k+1}\right].$$
If, for example, a = 0 and b = 1, then the result is
$$\frac{r}{2s} \cdot \frac{1}{k+1}\left[(s + t)^{k+1} - t^{k+1}\right].$$
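The closed-form value for a = 0 and b = 1 can be compared with a direct numerical integration. The sketch below uses the arbitrary constants r = 3, s = 2, t = 1 and k = 4 and a midpoint-rule approximation; these values, the method and the names are assumptions made for illustration. With these constants the formula gives (3/4)(1/5)(3^5 − 1^5) = 36.3.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


# Hypothetical constants for illustration; the formula needs s != 0, k != -1.
r, s, t, k = 3.0, 2.0, 1.0, 4


def integrand(x):
    return r * x * (s * x ** 2 + t) ** k


# Closed form from the change of variable y = s x^2 + t, with a = 0, b = 1.
closed_form = (r / (2 * s)) * (1 / (k + 1)) * ((s + t) ** (k + 1) - t ** (k + 1))

numeric = midpoint_integral(integrand, 0.0, 1.0)

print(f"closed form: {closed_form:.6f}")
print(f"numerical:   {numeric:.6f}")
```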