Taylor Expansion and Extrema
Taylor expansions for functions of several variables, with integral and Lagrange remainders. Critical points, the Hessian matrix, and the second derivative test for classifying extrema.
In this chapter we develop the higher-order differential calculus of scalar-valued functions of several real variables. After recalling the partial and total derivatives, we prove Clairaut's theorem on the equality of mixed partials, derive the multivariable Taylor expansion with both integral and Lagrange remainders, and use this expansion to analyze critical points via the Hessian matrix. The culmination is the second derivative test, which classifies nondegenerate critical points as local minima, local maxima, or saddle points via the spectral theorem for symmetric matrices.
Partial Derivatives and the Total Derivative
We briefly recall the two notions of derivative for a function of several variables.
Let $U \subseteq \mathbb{R}^n$ be open, $f : U \to \mathbb{R}$, and $a \in U$. For $i \in \{1, \dots, n\}$, the $i$-th partial derivative of $f$ at $a$ is
$$\partial_i f(a) = \lim_{h \to 0} \frac{f(a + h e_i) - f(a)}{h}$$
whenever this limit exists, where $e_i \in \mathbb{R}^n$ is the $i$-th standard basis vector.
Intuition: The partial derivative $\partial_i f(a)$ measures the rate of change of $f$ in the direction of the $i$-th coordinate axis. All other variables are held fixed, so this reduces to a single-variable derivative.
Let $U \subseteq \mathbb{R}^n$ open and $f : U \to \mathbb{R}$. We say $f$ is differentiable at $a \in U$ if there exists a linear map $L : \mathbb{R}^n \to \mathbb{R}$ such that
$$\lim_{h \to 0} \frac{f(a + h) - f(a) - L(h)}{\|h\|} = 0.$$
The linear map $L =: Df(a)$ is called the total derivative (or differential) of $f$ at $a$.
Intuition: Differentiability is a stronger notion than the existence of partial derivatives. It asks for a single linear map which approximates $f$ well in every direction simultaneously. When $f$ is differentiable, $Df(a)$ is represented by the gradient:
$$Df(a)(h) = \nabla f(a) \cdot h = \sum_{i=1}^{n} \partial_i f(a)\, h_i.$$
If all partial derivatives of $f$ exist and are continuous on $U$ (that is, $f \in C^1(U)$), then $f$ is differentiable at every point of $U$.
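As a quick numerical sanity check (a minimal sketch; the sample function $e^x \sin y$, the evaluation point, and the step size are illustrative choices, not part of the text), one can compare central finite differences against the exact gradient of a $C^1$ function:

```python
import math

def f(x, y):
    # Smooth (C^1) sample function whose gradient is known in closed form.
    return math.exp(x) * math.sin(y)

def grad_f(x, y):
    # Exact gradient: (e^x sin y, e^x cos y).
    return (math.exp(x) * math.sin(y), math.exp(x) * math.cos(y))

def partial(g, x, y, i, h=1e-6):
    # Central difference approximation of the i-th partial derivative.
    if i == 0:
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

a = (0.5, 1.2)
numeric = (partial(f, *a, 0), partial(f, *a, 1))
exact = grad_f(*a)
print(max(abs(n - e) for n, e in zip(numeric, exact)))  # difference is tiny
```

The central difference has error $O(h^2)$, so the agreement is far better than the step size itself.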
Higher Order Partial Derivatives
We now iterate the process of taking partial derivatives.
Let $f : U \to \mathbb{R}$ with $U \subseteq \mathbb{R}^n$ open. If $\partial_j f$ exists on a neighbourhood of $a$ and is itself differentiable in direction $i$ at $a$, we define the second order partial derivative
$$\partial_i \partial_j f(a) := \partial_i (\partial_j f)(a).$$
By iteration, for a multi-index $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n$ with $|\alpha| = \alpha_1 + \cdots + \alpha_n$ we define
$$\partial^\alpha f := \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n} f.$$
We say $f$ is of class $C^k$ on $U$ (written $f \in C^k(U)$) if all partial derivatives of $f$ up to order $k$ exist and are continuous on $U$.
Intuition: Class $C^k$ functions have continuous derivatives in every mix of directions. The notation $\partial_i \partial_j f$ means "first differentiate with respect to $x_j$, then with respect to $x_i$." A priori, this is different from reversing the order. Clairaut's theorem tells us that sufficient smoothness makes the order irrelevant.
Let $U \subseteq \mathbb{R}^2$ open and $f : U \to \mathbb{R}$. Suppose $\partial_1 f$, $\partial_2 f$, $\partial_1 \partial_2 f$, and $\partial_2 \partial_1 f$ all exist and are continuous on $U$. Then
$$\partial_1 \partial_2 f = \partial_2 \partial_1 f \quad \text{on } U.$$
In particular, if $f \in C^2(U)$, mixed partials commute.
Intuition: Differentiation with respect to different variables can be performed in any order, provided the second derivatives are continuous. This turns the Hessian matrix into a symmetric matrix, which will be crucial for the spectral argument in the second derivative test.
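Clairaut's theorem can be seen numerically: differentiate the exact first partials in the two possible orders with central differences and compare. (A minimal sketch; the polynomial $x^3 y^2 + x y^4$ and the evaluation point are illustrative choices.)

```python
def f_x(x, y):
    # Exact first partial in x of f(x, y) = x^3 y^2 + x y^4.
    return 3 * x**2 * y**2 + y**4

def f_y(x, y):
    # Exact first partial in y of the same f.
    return 2 * x**3 * y + 4 * x * y**3

def d_dx(g, x, y, h=1e-6):
    # Central difference in x.
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-6):
    # Central difference in y.
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

a = (1.3, -0.7)
fxy = d_dy(f_x, *a)   # first x, then y
fyx = d_dx(f_y, *a)   # first y, then x
print(abs(fxy - fyx))  # both approximate 6x^2 y + 4y^3; difference is tiny
```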
Multivariable Taylor Expansion
We first recall the single-variable Taylor theorem, then lift it to several variables by restricting to a line.
Let $I \subseteq \mathbb{R}$ open, $g : I \to \mathbb{R}$ of class $C^{k+1}$ for some $k \geq 0$. Let $t_0 \in I$ and $t \in I$ with $[t_0, t] \subseteq I$. Then
$$g(t) = \sum_{j=0}^{k} \frac{g^{(j)}(t_0)}{j!} (t - t_0)^j + R_k(t),$$
where the integral remainder is
$$R_k(t) = \int_{t_0}^{t} \frac{(t - s)^k}{k!}\, g^{(k+1)}(s)\, ds.$$
Intuition: The Taylor polynomial approximates $g$ at $t_0$ to order $k$, and the remainder $R_k$ records the error. The integral form of the remainder has a clean inductive proof by repeated integration by parts and is the cleanest remainder to lift to several variables.
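The identity "Taylor polynomial plus integral remainder equals $g(t)$" can be checked numerically. The sketch below (with $g = \exp$, $t_0 = 0$, $k = 2$ as illustrative choices) approximates the integral remainder with the composite midpoint rule:

```python
import math

def g(t):
    return math.exp(t)

def taylor_sum(t, k=2):
    # Taylor polynomial of e^t about t0 = 0: all derivatives equal 1 there.
    return sum(t**j / math.factorial(j) for j in range(k + 1))

def integral_remainder(t, k=2, n=1000):
    # R_k(t) = integral from 0 to t of (t-s)^k / k! * g^{(k+1)}(s) ds,
    # here g^{(k+1)} = exp; approximated by the midpoint rule on n cells.
    ds = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        total += (t - s)**k / math.factorial(k) * math.exp(s)
    return total * ds

t = 0.7
lhs = g(t)
rhs = taylor_sum(t) + integral_remainder(t)
print(abs(lhs - rhs))  # tiny: polynomial + remainder reproduces g(t)
```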
We now use this to expand a multivariable function by restricting to a line.
Let $U \subseteq \mathbb{R}^n$ open and $f : U \to \mathbb{R}$ of class $C^{k+1}$. Let $a \in U$ and $h \in \mathbb{R}^n$ small enough that $a + t h \in U$ for all $t \in [0, 1]$. Then
$$f(a + h) = \sum_{j=0}^{k} \frac{1}{j!} (D_h)^j f(a) + R_k(h),$$
where
$$D_h := \sum_{i=1}^{n} h_i\, \partial_i$$
is the directional differential operator, and
$$R_k(h) = \int_0^1 \frac{(1 - t)^k}{k!}\, (D_h)^{k+1} f(a + t h)\, dt.$$
Intuition: The multivariable Taylor polynomial is obtained by applying the "total directional derivative" operator $D_h$ repeatedly. Each application of $D_h$ brings down a factor of $h_i$ by the chain rule, producing all mixed partials weighted by appropriate products of the components of $h$.
Second-Order Expansion and the Hessian
The case $k = 2$ (so $f \in C^3$) is the one we need for the second derivative test. Expanding $D_h f(a) = \nabla f(a) \cdot h$ and $(D_h)^2 f(a) = h^\top H_f(a)\, h$, we obtain:
Let $f : U \to \mathbb{R}$ of class $C^3$ and $a \in U$. Then for small $h$,
$$f(a + h) = f(a) + \nabla f(a) \cdot h + \tfrac{1}{2}\, h^\top H_f(a)\, h + R_2(h),$$
where
$$H_f(a) = \big( \partial_i \partial_j f(a) \big)_{i,j=1}^{n}$$
is the Hessian matrix of $f$ at $a$.
Intuition: The Hessian plays the role of the second derivative for a scalar-valued function of several variables. Because $f \in C^2$, Clairaut's theorem makes $H_f(a)$ symmetric. The quadratic form $h \mapsto h^\top H_f(a)\, h$ captures the leading-order curvature of $f$ at $a$.
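A numerical signature of the second-order expansion is that its error decays cubically in $\|h\|$. The sketch below (the sample function $\sin x\, e^y$, base point, and increment are illustrative choices) checks that halving $h$ shrinks the error by roughly $2^3 = 8$:

```python
import math

def f(x, y):
    # Illustrative smooth function.
    return math.sin(x) * math.exp(y)

def taylor2(a, h):
    # Second-order Taylor polynomial of f at a, using exact derivatives:
    # f_x = cos x e^y, f_y = sin x e^y,
    # f_xx = -sin x e^y, f_xy = cos x e^y, f_yy = sin x e^y.
    x, y = a
    h1, h2 = h
    s, c, e = math.sin(x), math.cos(x), math.exp(y)
    lin = c * e * h1 + s * e * h2
    quad = 0.5 * (-s * e * h1**2 + 2 * c * e * h1 * h2 + s * e * h2**2)
    return s * e + lin + quad

a = (0.4, 0.3)
h = (0.1, -0.2)
errors = []
for t in (1.0, 0.5, 0.25):
    th = (t * h[0], t * h[1])
    err = abs(f(a[0] + th[0], a[1] + th[1]) - taylor2(a, th))
    errors.append(err)
# Halving h should shrink the error by roughly 8 (cubic remainder).
print(errors[0] / errors[1], errors[1] / errors[2])
```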
Bound on the Remainder
Suppose $f \in C^3$ and all third order partial derivatives of $f$ are bounded in absolute value by $M$ on a neighbourhood of $a$. Then for $\|h\|$ sufficiently small,
$$|R_2(h)| \leq \frac{M}{6}\, \|h\|_1^3,$$
where $\|h\|_1 = \sum_{i=1}^{n} |h_i|$. Indeed, the Lagrange form of the remainder gives $R_2(h) = \frac{1}{6} (D_h)^3 f(a + \theta h)$ for some $\theta \in (0, 1)$; expanding $(D_h)^3 = \sum_{i,j,k} h_i h_j h_k\, \partial_i \partial_j \partial_k$ and bounding each third partial by $M$ yields the claim.
Let $f(x, y) = x^2 y$, $a = (1, 2)$. We compute the Taylor expansion of $f$ to second order about $a$.
The partial derivatives are: $\partial_x f = 2xy$, $\partial_y f = x^2$, $\partial_x^2 f = 2y$, $\partial_x \partial_y f = 2x$, $\partial_y^2 f = 0$. Evaluating at $a = (1, 2)$: $\nabla f(a) = (4, 1)$ and $H_f(a) = \begin{pmatrix} 4 & 2 \\ 2 & 0 \end{pmatrix}$. Noting $f(a) = 2$, the second-order Taylor expansion reads
$$f(1 + h_1, 2 + h_2) = 2 + 4 h_1 + h_2 + 2 h_1^2 + 2 h_1 h_2 + R_2(h).$$
Since $f$ is a polynomial, we may compute the remainder exactly:
$$(1 + h_1)^2 (2 + h_2) = 2 + 4 h_1 + h_2 + 2 h_1^2 + 2 h_1 h_2 + h_1^2 h_2,$$
so $R_2(h) = h_1^2 h_2$, which is a cubic monomial and hence not part of the second-order Taylor polynomial.
For an upper bound: all third-order partials of $f$ equal $0$ except $\partial_x^2 \partial_y f = 2$ (and its reorderings). So $M = 2$, and with $\|h\|_1 = |h_1| + |h_2|$,
$$|R_2(h)| \leq \frac{2}{6}\, \|h\|_1^3 = \frac{1}{3}\, \|h\|_1^3.$$
Numerical consequence. If $\|h\|_1 \leq 0.1$, then without computing the remainder explicitly we know
$$|R_2(h)| \leq \frac{1}{3} (0.1)^3 = \frac{1}{3000} < 3.4 \times 10^{-4},$$
so the true value of $f(a + h)$ differs from the Taylor approximation by less than $3.4 \times 10^{-4}$. This is the practical payoff of the Lagrange bound: we can guarantee a quantitative error bound for the approximation without ever computing the error explicitly.
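The Lagrange bound can also be checked by brute force on a grid of increments. The sketch below (using the cubic $f(x, y) = x^2 y$ about $a = (1, 2)$ as an assumed concrete example, with $M = 2$ bounding its third partials) compares the exact remainder against $\frac{M}{6} \|h\|_1^3$:

```python
def f(x, y):
    # Assumed concrete example: a cubic whose remainder is exactly h1^2 h2.
    return x * x * y

def taylor2(h1, h2):
    # Second-order Taylor polynomial of f about (1, 2):
    # f(1,2) = 2, grad = (4, 1), Hessian = [[4, 2], [2, 0]].
    return 2 + 4 * h1 + h2 + 0.5 * (4 * h1**2 + 2 * 2 * h1 * h2)

M = 2.0
worst_ratio = 0.0
# Grid of increments in [-0.1, 0.1]^2, built from integers so 0.0 is exact.
steps = [((i - 5) / 50, (j - 5) / 50) for i in range(11) for j in range(11)]
for h1, h2 in steps:
    norm1 = abs(h1) + abs(h2)
    if norm1 == 0:
        continue
    remainder = abs(f(1 + h1, 2 + h2) - taylor2(h1, h2))
    bound = M / 6 * norm1**3
    worst_ratio = max(worst_ratio, remainder / bound)
print(worst_ratio)  # stays below 1: the bound holds on the whole grid
```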
Relative Extrema and Critical Points
Let $U \subseteq \mathbb{R}^n$ open, $f : U \to \mathbb{R}$, $a \in U$.
- $f$ has a relative (local) minimum at $a$ iff $f(x) \geq f(a)$ for all $x$ in some neighbourhood of $a$.
- $f$ has a relative (local) maximum at $a$ iff $f(x) \leq f(a)$ for all $x$ in some neighbourhood of $a$.
- $f$ has a saddle point at $a$ iff in every neighbourhood of $a$ there exist $x$ and $y$ with $f(x) < f(a) < f(y)$.
- $f$ has a relative extremum at $a$ iff it has a relative min or max at $a$.
Intuition: The first two cases extend the familiar local min/max from calculus. A saddle point sits in between: along some directions $f$ increases away from $a$, along others it decreases. Saddles are the genuinely higher-dimensional phenomenon.
We first recall the single-variable version, since the multivariable result reduces to it along each coordinate direction.
Let $g : (\alpha, \beta) \to \mathbb{R}$ and suppose $g$ has either a relative maximum or a relative minimum at $t_0 \in (\alpha, \beta)$. If $g'(t_0)$ exists, then $g'(t_0) = 0$.
Let $U \subseteq \mathbb{R}^n$ open, $f : U \to \mathbb{R}$. If $f$ has a relative extremum at $a \in U$ and $f$ is differentiable at $a$, then $\nabla f(a) = 0$.
Intuition: This is the natural multivariable Fermat theorem. If we can approach $a$ from every direction and $f$ has an extremum there, then the directional derivative must vanish in every direction; equivalently, the gradient vanishes.
A point $a \in U$ is a critical point of $f$ iff $\nabla f(a) = 0$.
Intuition: Critical points are the candidates for local extrema and saddle points. The vanishing gradient condition is necessary but not sufficient: not every critical point is an extremum.
The Hessian and Definiteness
Let $A \in \mathbb{R}^{n \times n}$ be a symmetric real matrix. We say $A$ is
- positive definite iff $x^\top A x > 0$ for all $x \neq 0$;
- negative definite iff $x^\top A x < 0$ for all $x \neq 0$;
- positive semi-definite iff $x^\top A x \geq 0$ for all $x$;
- negative semi-definite iff $x^\top A x \leq 0$ for all $x$;
- indefinite iff there exist $x, y$ with $x^\top A x > 0$ and $y^\top A y < 0$.
Intuition: The sign of the quadratic form $x \mapsto x^\top A x$ encodes how curvature bends. Positive definite matrices curve upward in every direction, negative definite curve downward. Indefinite matrices curve up in some directions and down in others: the hallmark of a saddle.
Let $A \in \mathbb{R}^{n \times n}$ be a symmetric real matrix with eigenvalues $\lambda_1, \dots, \lambda_n$ (listed with multiplicity). Then
- $A$ is positive definite iff $\lambda_i > 0$ for all $i$;
- $A$ is negative definite iff $\lambda_i < 0$ for all $i$;
- $A$ is indefinite iff $A$ has both a positive and a negative eigenvalue;
- $A$ is nonsingular iff no eigenvalue equals $0$, i.e. iff $\det A \neq 0$.
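For $2 \times 2$ symmetric matrices the eigenvalues are available in closed form, so the classification above can be implemented directly. (A minimal sketch; the helper name classify_sym2 and the sample matrices are ours.)

```python
import math

def classify_sym2(a, b, c):
    # Classify the symmetric 2x2 matrix [[a, b], [b, c]] by its eigenvalues,
    # computed in closed form: lam = (tr +/- sqrt(tr^2 - 4 det)) / 2.
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))  # >= 0 for symmetric input
    lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2
    if lam1 > 0 and lam2 > 0:
        return "positive definite"
    if lam1 < 0 and lam2 < 0:
        return "negative definite"
    if lam1 > 0 and lam2 < 0:
        return "indefinite"
    return "singular (semi-definite)"

print(classify_sym2(2, 0, 3))    # positive definite
print(classify_sym2(-1, 0, -5))  # negative definite
print(classify_sym2(1, 3, 1))    # indefinite: eigenvalues 4 and -2
```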
The Second Derivative Test
We can now state and prove the multivariable second derivative test, which is the main application of the Taylor expansion.
Let $U \subseteq \mathbb{R}^n$ open and $f \in C^3(U)$. Let $a \in U$ be a critical point of $f$ (so $\nabla f(a) = 0$) and assume $H_f(a)$ is nonsingular. Then:
(i) If $H_f(a)$ is positive definite, $f$ has a relative minimum at $a$.
(ii) If $H_f(a)$ is negative definite, $f$ has a relative maximum at $a$.
(iii) If $H_f(a)$ is indefinite, $f$ has a saddle point at $a$.
Intuition: Because $\nabla f(a) = 0$, the leading order behaviour of $f$ near $a$ is controlled by the quadratic form $h \mapsto \tfrac{1}{2}\, h^\top H_f(a)\, h$. The remainder is of order $\|h\|^3$ and becomes negligible for small $h$. The definiteness of the Hessian therefore determines whether $f$ curves up, down, or in different directions, locally.
Intuition: When $\det H_f(a) = 0$, the test is inconclusive: higher-order terms in the Taylor expansion are needed to decide the nature of the critical point. Examples such as $x^4 + y^4$, $-(x^4 + y^4)$, and $x^4 - y^4$ at the origin show that a critical point with zero Hessian determinant may be a min, max, or saddle.
Computational note. Checking whether $\det H_f(a) \neq 0$ and computing the eigenvalues (or equivalently the signs of the leading principal minors, by Sylvester's criterion) can be done quite easily numerically provided $n$ is not too large. In two variables, with
$$D = \det H_f(a) = f_{xx}(a)\, f_{yy}(a) - f_{xy}(a)^2,$$
the test simplifies to:
- $D > 0$ and $f_{xx}(a) > 0$: local minimum.
- $D > 0$ and $f_{xx}(a) < 0$: local maximum.
- $D < 0$: saddle.
- $D = 0$: inconclusive.
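The two-variable rules translate directly into a small helper (a sketch; the function name and the sample inputs are ours):

```python
def second_derivative_test(fxx, fyy, fxy):
    # Two-variable second derivative test via D = fxx*fyy - fxy^2.
    D = fxx * fyy - fxy * fxy
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle"
    return "inconclusive"

# f(x, y) = x^2 + y^2 at the origin: fxx = fyy = 2, fxy = 0.
print(second_derivative_test(2, 2, 0))   # local minimum
# f(x, y) = x^2 - y^2 at the origin: fxx = 2, fyy = -2, fxy = 0.
print(second_derivative_test(2, -2, 0))  # saddle
```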
A Worked Example
Classify all critical points of the function $f : \mathbb{R}^2 \to \mathbb{R}$ given by
$$f(x, y) = x^3 - 3x + y^4 - 2y^2.$$
Step 1: Find the critical points. We have
$$\partial_x f = 3x^2 - 3, \qquad \partial_y f = 4y^3 - 4y.$$
Setting both to zero gives $x = \pm 1$ and $y \in \{-1, 0, 1\}$, so the six critical points are
$$(1, 0),\quad (-1, 0),\quad (1, \pm 1),\quad (-1, \pm 1).$$
Step 2: Compute the Hessian. We have $\partial_x^2 f = 6x$, $\partial_y^2 f = 12y^2 - 4$, $\partial_x \partial_y f = 0$. Thus
$$H_f(x, y) = \begin{pmatrix} 6x & 0 \\ 0 & 12y^2 - 4 \end{pmatrix},$$
which is diagonal, so its eigenvalues are the diagonal entries.
Step 3: Classify each point.
- At $(1, 0)$: eigenvalues $(6, -4)$, indefinite, so $(1, 0)$ is a saddle point.
- At $(-1, 0)$: eigenvalues $(-6, -4)$, negative definite, so it is a relative maximum.
- At $(1, \pm 1)$: eigenvalues $(6, 8)$, positive definite, so these are relative minima.
- At $(-1, \pm 1)$: eigenvalues $(-6, 8)$, indefinite, so these are saddle points.
Intuition: In low dimensions the Hessian is diagonal or $2 \times 2$, and one can often see the classification at a glance. In higher dimensions one computes eigenvalues (or equivalently the principal minors) numerically. The key fact is that the second derivative test reduces the multivariable problem to an eigenvalue problem for a symmetric matrix.
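Step 3 of a worked example like this can be automated. The sketch below takes the separable polynomial $f(x, y) = x^3 - 3x + y^4 - 2y^2$ as an assumed concrete choice; its Hessian is diagonal, so the eigenvalues are just the two pure second partials:

```python
def fxx(x, y):
    # Second partials of the assumed f(x, y) = x^3 - 3x + y^4 - 2y^2;
    # the mixed partial is identically 0, so the Hessian is diagonal.
    return 6 * x

def fyy(x, y):
    return 12 * y * y - 4

def classify(x, y):
    # Diagonal Hessian: its eigenvalues are just fxx and fyy.
    lx, ly = fxx(x, y), fyy(x, y)
    if lx > 0 and ly > 0:
        return "minimum"
    if lx < 0 and ly < 0:
        return "maximum"
    if lx * ly < 0:
        return "saddle"
    return "inconclusive"

# Critical points: x = +/-1 (from 3x^2 - 3 = 0), y in {-1, 0, 1}
# (from 4y^3 - 4y = 0).
critical_points = [(x, y) for x in (-1, 1) for y in (-1, 0, 1)]
results = {p: classify(*p) for p in critical_points}
for p, r in sorted(results.items()):
    print(p, r)
```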