ACE 328/Chapter 10

The Implicit Function Theorem

When can an implicit equation F(x, y) = 0 be solved locally for y as a function of x? The implicit function theorem gives sufficient conditions via the non-singular Jacobian. Inverse function theorem as a special case.

Given an equation $F(x, y) = 0$ relating two variables, when can we solve for $y$ as a function of $x$ ? Globally, rarely: the unit circle $x^2 + y^2 - 1 = 0$ is not the graph of any single function $y = g(x)$ because two values of $y$ correspond to each $x \in (-1, 1)$ . But locally — in a neighbourhood of a given point on the curve — the answer is often yes, provided one partial derivative does not vanish. The implicit function theorem makes this precise, in arbitrary dimensions, and does so constructively: it proves the existence of the local solution function $g$ by applying the Banach fixed-point theorem to a contraction built from the data. It also gives an explicit formula for the derivative of $g$ , even without a closed-form expression for $g$ itself. Its special case, the inverse function theorem, is the local invertibility criterion for smooth maps. This chapter develops the theorem, proves it in full, and explores its geometric and computational consequences.

Motivation: The Circle

The unit circle $C = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ is not globally a graph, but every point of $C$ has a neighbourhood in which $C$ is the graph of a function of one of the variables:

Near any point with $y > 0$ : $y = \sqrt{1 - x^2}$ .
Near any point with $y < 0$ : $y = -\sqrt{1 - x^2}$ .
Near any point with $x > 0$ : $x = \sqrt{1 - y^2}$ .
Near any point with $x < 0$ : $x = -\sqrt{1 - y^2}$ .

At $(\pm 1, 0)$ we cannot write $y$ as a function of $x$ , but we can write $x$ as a function of $y$ . The function $f(x, y) = x^2 + y^2 - 1$ has partials $f_x = 2x$ and $f_y = 2y$ . The inability to solve for $y$ in terms of $x$ at $(\pm 1, 0)$ is mirrored by $f_y = 2y = 0$ there. The inability to solve for $x$ in terms of $y$ at $(0, \pm 1)$ is mirrored by $f_x = 0$ there. The implicit function theorem promotes this observation to a general principle.

Two-Dimensional Implicit Function Theorem

Theorem. Dini's Implicit Function Theorem (2D, 1877)

Let $D \subseteq \mathbb{R}^2$ be open and $f : D \to \mathbb{R}$ of class $C^1$ on $D$ . Let $(x_0, y_0) \in D$ satisfy $f(x_0, y_0) = 0 \quad \text{and} \quad \frac{\partial f}{\partial y}(x_0, y_0) \neq 0.$ Then there exist $r > 0$ and a $C^1$ function $g : (x_0 - r, x_0 + r) \to \mathbb{R}$ such that $g(x_0) = y_0$ , and for every $(x, y)$ with $|x - x_0| < r$ and $|y - y_0| < r$ , $f(x, y) = 0 \iff y = g(x).$ Moreover, for every $x \in (x_0 - r, x_0 + r)$ , $\frac{\partial f}{\partial x}(x, g(x)) + \frac{\partial f}{\partial y}(x, g(x)) \cdot g'(x) = 0, \qquad \text{so } g'(x) = -\frac{\partial f / \partial x(x, g(x))}{\partial f / \partial y(x, g(x))}.$

Remark.

Intuition: The condition $f_y(x_0, y_0) \neq 0$ says the level curve $\{f = 0\}$ does not have a vertical tangent (one parallel to the $y$-axis) at $(x_0, y_0)$. Infinitesimally, the linearization $f_x(x_0, y_0) \Delta x + f_y(x_0, y_0) \Delta y = 0$ can be solved for $\Delta y$ in terms of $\Delta x$ iff $f_y \neq 0$. The theorem says this infinitesimal solvability extends to a full nonlinear local solution. The derivative formula is obtained by applying the chain rule to the identity $f(x, g(x)) \equiv 0$ and solving for $g'$.

The 2D theorem follows from the higher-dimensional version proven below. We skip a direct proof and pass to the general case.
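The derivative formula can be checked numerically on a curve with no elementary closed form. The following is a minimal sketch, assuming the illustrative equation $f(x, y) = x e^y + y$ at $(x_0, y_0) = (0, 0)$ (an example chosen here, not taken from the text) and using Newton's method in $y$ as a stand-in for the local solution $g$; all function names are hypothetical.

```python
import math

# Illustrative choice (an assumption, not from the text): f(x, y) = x*e^y + y,
# which vanishes at (0, 0) and has f_y(0, 0) = 1 != 0.
def f(x, y):
    return x * math.exp(y) + y

def f_x(x, y):
    return math.exp(y)

def f_y(x, y):
    return x * math.exp(y) + 1.0

def g(x, y_start=0.0, tol=1e-15, max_iter=50):
    """Approximate the local solution y = g(x) by Newton's method in y."""
    y = y_start
    for _ in range(max_iter):
        step = f(x, y) / f_y(x, y)
        y -= step
        if abs(step) < tol:
            break
    return y

x0, y0 = 0.0, 0.0
h = 1e-5
fd_slope = (g(x0 + h) - g(x0 - h)) / (2 * h)   # central difference for g'(x0)
formula_slope = -f_x(x0, y0) / f_y(x0, y0)     # -f_x / f_y from the theorem

print(fd_slope, formula_slope)
```

The two printed slopes should agree up to the finite-difference error, even though $g$ itself is only available through the iteration.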

Preliminary Tools

The proof of the full implicit function theorem rests on two results we state and use.

Theorem. Mean Value Inequality for Vector-Valued Maps

Let $U \subseteq \mathbb{R}^m$ be open and $F : U \to \mathbb{R}^m$ differentiable on $U$ . Suppose the line segment $[p, q] = \{p + t(q - p) : 0 \leq t \leq 1\}$ is contained in $U$ . Then $\|F(q) - F(p)\| \leq M \|q - p\|, \qquad M = \sup_{x \in U} \|DF(x)\|_{\mathrm{op}},$ where $\|L\|_{\mathrm{op}} = \sup_{\|v\|=1} \|Lv\|$ is the operator norm.

Theorem. Banach Fixed-Point Theorem (Contraction Mapping Principle)

Let $Y \subseteq \mathbb{R}^m$ be a non-empty closed subset, and let $K : Y \to Y$ be a contraction: there exists $0 \leq \lambda < 1$ such that for every $y_1, y_2 \in Y$ , $\|K(y_1) - K(y_2)\| \leq \lambda \|y_1 - y_2\|.$ Then there exists a unique $y^* \in Y$ with $K(y^*) = y^*$ .

Remark.

Intuition: A contraction shrinks distances by a factor $\lambda < 1$. Iterating it produces a Cauchy sequence (geometrically shrinking differences), which converges because $Y$ is closed in the complete space $\mathbb{R}^m$. The limit must be fixed because $K$ is continuous, and it is unique because two fixed points would have to be at distance $\leq \lambda \cdot$ their own distance, forcing them to coincide.
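The iteration described in the remark can be sketched in a few lines. This is only an illustration under stated assumptions: the map $K(y) = \cos y$ on $Y = [0, 1]$ is an example chosen here (it maps $Y$ into itself and satisfies $|K'(y)| \leq \sin 1 < 1$ there), and the function name `fixed_point` is hypothetical.

```python
import math

def K(y):
    # Illustrative contraction (an assumption): cos maps [0, 1] into itself
    # and |K'(y)| = |sin y| <= sin(1) < 1 on that interval.
    return math.cos(y)

def fixed_point(K, y_init, tol=1e-13, max_iter=1000):
    """Iterate y_{k+1} = K(y_k) until successive iterates differ by less than tol."""
    y = y_init
    for k in range(1, max_iter + 1):
        y_next = K(y)
        if abs(y_next - y) < tol:
            return y_next, k
        y = y_next
    raise RuntimeError("no convergence within max_iter iterations")

y_star, iterations = fixed_point(K, 0.0)
print(y_star, iterations)
```

The number of iterations grows only logarithmically in the requested tolerance, reflecting the geometric shrinking of successive differences.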

The General Implicit Function Theorem

Theorem. Implicit Function Theorem

Let $U \subseteq \mathbb{R}^{n+m} = \mathbb{R}^n \times \mathbb{R}^m$ be open, and let $(x_0, y_0) \in U$ . Let $f : U \to \mathbb{R}^m$ be of class $C^1$ in a neighbourhood of $(x_0, y_0)$ , with $f(x_0, y_0) = 0$ . Write $f = (f_1, \ldots, f_m)$ and introduce the $m \times n$ and $m \times m$ matrices $A = \left( \frac{\partial f_i}{\partial x_j}(x_0, y_0) \right)_{\substack{i = 1, \ldots, m \\ j = 1, \ldots, n}}, \qquad B = \left( \frac{\partial f_i}{\partial y_j}(x_0, y_0) \right)_{i, j = 1, \ldots, m}.$ Assume that $B$ is invertible. Then there exist an open neighbourhood $X$ of $x_0$ in $\mathbb{R}^n$ , an open neighbourhood $Y$ of $y_0$ in $\mathbb{R}^m$ with $X \times Y \subseteq U$ , and a $C^1$ function $g : X \to Y$ with $g(x_0) = y_0$ , such that for every $(x, y) \in X \times Y$ , $f(x, y) = 0 \iff y = g(x).$ Moreover, for every $x \in X$ , $Dg(x) : \mathbb{R}^n \to \mathbb{R}^m$ is given by the matrix $-B_x^{-1} A_x, \qquad \text{where } A_x = \left( \frac{\partial f_i}{\partial x_j}(x, g(x)) \right), \quad B_x = \left( \frac{\partial f_i}{\partial y_j}(x, g(x)) \right).$

Remark.

Intuition: Split variables into "inputs" $x$ and "outputs to be solved for" $y$. The hypothesis "$B$ is invertible" says the infinitesimal system $A \Delta x + B \Delta y = 0$ is uniquely solvable for $\Delta y$: $\Delta y = -B^{-1} A \Delta x$. The theorem lifts this to the nonlinear level: the map $x \mapsto y(x) = g(x)$ exists locally, is $C^1$, and its derivative is the infinitesimal solution $-B^{-1} A$ (evaluated at the current point). The proof rewrites the equation $f(x, y) = 0$ as a fixed-point problem for $y$ at each $x$ and applies the contraction mapping principle.
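One common way to set up the fixed-point problem is the map $K_x(y) = y - B^{-1} f(x, y)$, whose fixed points are exactly the solutions of $f(x, y) = 0$. The sketch below iterates this map for a hypothetical system with $n = 1$, $m = 2$, namely $f_1 = x^2 + y_1^2 + y_2^2 - 3$ and $f_2 = x + y_1 - y_2 - 1$ at $(x_0, y_0) = (1, (1, 1))$, and compares a finite-difference derivative of the result with $-B^{-1} A$. The system and all helper names are assumptions made for illustration, not part of the text.

```python
# Hypothetical system (an assumption): f(1, (1, 1)) = (0, 0) and B is invertible.
def f(x, y):
    y1, y2 = y
    return (x * x + y1 * y1 + y2 * y2 - 3.0, x + y1 - y2 - 1.0)

x0, y0 = 1.0, (1.0, 1.0)
A = (2.0 * x0, 1.0)                              # partials of (f1, f2) in x
B = ((2.0 * y0[0], 2.0 * y0[1]), (1.0, -1.0))    # partials of (f1, f2) in (y1, y2)

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def matvec(M, v):
    return tuple(row[0] * v[0] + row[1] * v[1] for row in M)

B_inv = inverse_2x2(B)

def g(x, tol=1e-14, max_iter=200):
    """Iterate K_x(y) = y - B^{-1} f(x, y), with B frozen at (x0, y0)."""
    y = y0
    for _ in range(max_iter):
        step = matvec(B_inv, f(x, y))
        y = (y[0] - step[0], y[1] - step[1])
        if max(abs(step[0]), abs(step[1])) < tol:
            break
    return y

h = 1e-5
g_plus, g_minus = g(x0 + h), g(x0 - h)
fd = tuple((p - m) / (2 * h) for p, m in zip(g_plus, g_minus))  # approximates Dg(x0)
Bi_A = matvec(B_inv, A)
formula = (-Bi_A[0], -Bi_A[1])                                  # -B^{-1} A

print(fd, formula)
```

Freezing $B$ at the base point keeps each step cheap; the iteration contracts only for $x$ close to $x_0$, which matches the local nature of the theorem.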

The Inverse Function Theorem

We first pause for some terminology, then state the one-variable warm-up, then pass to the general case.

Definition. Homeomorphism and Diffeomorphism

Let $U, V \subseteq \mathbb{R}^n$ be open.

A map $F : U \to V$ is a homeomorphism if it is a continuous bijection with continuous inverse.
For $r \geq 1$ , $F$ is a $C^r$ diffeomorphism if $F$ is a bijection of class $C^r$ whose inverse $F^{-1}$ is also of class $C^r$ .
$F$ is a local $C^r$ diffeomorphism at $a \in U$ if there exist open neighbourhoods $U_0 \subseteq U$ of $a$ and $V_0 \subseteq V$ of $F(a)$ such that $F|_{U_0} : U_0 \to V_0$ is a $C^r$ diffeomorphism.

Remark.

Intuition: A homeomorphism preserves topological structure; a diffeomorphism preserves smooth structure. Local diffeomorphisms are the natural analogues of "locally invertible with differentiable inverse" for $\mathbb{R}^n$.

The One-Variable Inverse Function Theorem

Theorem. 1D Inverse Function Theorem

Let $J \subseteq \mathbb{R}$ be an open interval, $a \in J$ , and $f : J \to \mathbb{R}$ be of class $C^1$ . If $f'(a) \neq 0$ (equivalently, the linear map $\mathbb{R} \ni v \mapsto f'(a) v \in \mathbb{R}$ is invertible), then $f$ is locally invertible on some interval $I \subseteq J$ containing $a$ . The inverse $f^{-1} : f(I) \to I$ is of class $C^1$ , and $(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))} \qquad \text{for every } y \in f(I).$ In other words, $f$ is a local $C^1$ diffeomorphism at $a$ .

Remark.

Intuition: Since $f'$ is continuous, $f'(a) \neq 0$ means $f'$ keeps a constant sign near $a$, so $f$ is strictly monotonic near $a$ and hence injective there; continuity of $f$ then hands us a continuous inverse, and the formula for $(f^{-1})'$ follows by differentiating $f(f^{-1}(y)) = y$.
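The formula for $(f^{-1})'$ can be tested numerically. The following is a rough sketch, assuming the example $f(x) = e^x + x$ at $a = 0$ (chosen here for illustration; it has no elementary inverse) and inverting it by bisection, which works because $f$ is strictly increasing. The function names are hypothetical.

```python
import math

def f(x):
    # Illustrative choice (an assumption): f(x) = e^x + x, so f'(x) = e^x + 1 > 0.
    return math.exp(x) + x

def f_prime(x):
    return math.exp(x) + 1.0

def f_inverse(y, lo=-1.0, hi=1.0, iterations=80):
    """Invert the increasing function f on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = 0.0
b = f(a)
h = 1e-5
fd = (f_inverse(b + h) - f_inverse(b - h)) / (2 * h)   # numerical (f^{-1})'(b)
formula = 1.0 / f_prime(f_inverse(b))                  # 1 / f'(f^{-1}(b))

print(fd, formula)
```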

The Multivariable Version

The multivariable inverse function theorem is a special case of the implicit function theorem in which $f(x, y) = F(y) - x$ , so solving $f(x, y) = 0$ means inverting $F$ .

Theorem. Inverse Function Theorem

Let $V \subseteq \mathbb{R}^m$ be open, $F : V \to \mathbb{R}^m$ of class $C^1$ , and $y_0 \in V$ with $DF(y_0)$ invertible. Set $x_0 = F(y_0)$ . Then there exist an open neighbourhood $W$ of $x_0$ and an open neighbourhood $V_0 \subseteq V$ of $y_0$ , and a $C^1$ function $F^{-1} : W \to V_0$ such that $F^{-1}(F(y)) = y$ for $y \in V_0$ and $F(F^{-1}(x)) = x$ for $x \in W$ . Moreover, $D(F^{-1})(x) = (DF(F^{-1}(x)))^{-1}.$

Remark.

Intuition: If the derivative $DF(y_0)$ is invertible, then to first order $F$ acts as an invertible linear map near $y_0$. The theorem says this local linear invertibility lifts to local nonlinear invertibility. The derivative formula is the infinitesimal version of the inverse: the derivative of the inverse is the inverse of the derivative.
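A familiar instance where the local inverse is explicit is the polar-coordinate map. The sketch below, which assumes the example $F(r, \theta) = (r\cos\theta, r\sin\theta)$ with $r > 0$ (not an example from the text) and uses hypothetical helper names, compares a finite-difference Jacobian of the inverse with the matrix inverse of $DF$.

```python
import math

def F(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def F_inv(x, y):
    # Local inverse of F, valid for r > 0 and theta in (-pi, pi).
    return (math.hypot(x, y), math.atan2(y, x))

def DF(r, theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -r * s), (s, r * c))     # determinant is r

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def jacobian_fd(func, p, h=1e-6):
    """Central-difference Jacobian of a map R^2 -> R^2 at the point p."""
    cols = []
    for j in range(2):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(*plus), func(*minus)
        cols.append(((fp[0] - fm[0]) / (2 * h), (fp[1] - fm[1]) / (2 * h)))
    # cols[j] holds the partials with respect to variable j; rearrange into rows
    return ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))

r0, theta0 = 2.0, 0.5
x0 = F(r0, theta0)
numeric = jacobian_fd(F_inv, x0)           # D(F^{-1}) at x0, numerically
exact = inverse_2x2(DF(r0, theta0))        # (DF(F^{-1}(x0)))^{-1}
max_error = max(abs(numeric[i][j] - exact[i][j]) for i in range(2) for j in range(2))

print(max_error)
```

The map is not globally invertible ($\theta$ and $\theta + 2\pi$ give the same point), which is why only a local inverse on a restricted angle range is used.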

Remark.

An alternative construction (Lecture 27). The reduction "inverse function = implicit function with $f(x, y) = F(y) - x$" is often presented in the opposite direction: one starts with the target equation $F(y) = x$, defines an auxiliary function $\Phi : \mathbb{R}^m \times V \to \mathbb{R}^m$ by $\Phi(x, y) = F(y) - x$, notes that $\Phi$ is $C^1$ with $\Phi(x_0, y_0) = 0$ and $\partial \Phi / \partial y(x_0, y_0) = DF(y_0)$ invertible, and then extracts from the implicit function theorem a $C^1$ map $h$ on a neighbourhood $W$ of $x_0$ with $\Phi(x, h(x)) = 0$, i.e. $F(h(x)) = x$. So $h$ is a right inverse of $F$. The chain rule applied to $F \circ h = \mathrm{id}_W$ gives $DF(h(x)) \circ Dh(x) = I$, so $Dh(x)$ is invertible; applying the same argument to $h$ (with derivative $DF(h(x))^{-1}$) produces a right inverse $g$ of $h$. Then $F = F \circ (h \circ g) = (F \circ h) \circ g = g$ locally, so $h$ is a two-sided inverse of $F$, i.e. $h = F^{-1}$. This mirrors exactly how one proves the one-variable version.

Corollary. Derivative of the Inverse via Chain Rule

Suppose $U \subseteq \mathbb{R}^m$ is open, $a \in U$ , and $F : U \to \mathbb{R}^m$ is of class $C^1$ and has a $C^1$ inverse $F^{-1}$ defined on a neighbourhood of $b = F(a)$ . Then $D(F^{-1})(b) = \bigl(DF(a)\bigr)^{-1}.$

Remark.

Intuition: Once we know that $F^{-1}$ exists and is differentiable, the derivative formula comes for free from the chain rule. The hard content of the inverse function theorem is the existence of $F^{-1}$; the formula is an easy consequence.
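The chain-rule computation behind the corollary can be written out in two lines, using only the notation of the statement ($b = F(a)$, $I$ the identity matrix):

```latex
\begin{aligned}
F^{-1}(F(x)) = x \ \text{for $x$ near $a$}
  &\;\Longrightarrow\; D(F^{-1})(b)\, DF(a) = I, \\
F(F^{-1}(y)) = y \ \text{for $y$ near $b$}
  &\;\Longrightarrow\; DF(a)\, D(F^{-1})(b) = I,
\end{aligned}
\qquad\text{hence}\qquad
D(F^{-1})(b) = \bigl(DF(a)\bigr)^{-1}.
```

For square matrices either identity alone already forces the other, so one application of the chain rule suffices.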

Geometric Interpretation: Level Sets as Smooth Graphs

Let $f : U \to \mathbb{R}^m$ with $U \subseteq \mathbb{R}^{n+m}$ open, $f$ of class $C^1$ . The level set $\{(x, y) \in U : f(x, y) = 0\}$ is a subset of $\mathbb{R}^{n+m}$ . The implicit function theorem says that, near any point of the level set where the partial Jacobian with respect to the last $m$ variables is invertible, the level set is the graph of a $C^1$ function $g$ defined on an open subset of $\mathbb{R}^n$ with values in $\mathbb{R}^m$ . A subset of $\mathbb{R}^{n+m}$ that is, near each of its points, the graph of a smooth function of $n$ of the coordinates is called a smooth $n$ -dimensional submanifold of $\mathbb{R}^{n+m}$ .

More generally, given $F : U \to \mathbb{R}^m$ of class $C^1$ with $U \subseteq \mathbb{R}^N$ open, if $DF(p)$ has full rank $m$ at some point $p$ with $F(p) = 0$ , then there is a choice of $m$ coordinates in $\mathbb{R}^N$ whose partials give an invertible $m \times m$ submatrix; after reindexing, the implicit function theorem applies and the level set $\{F = 0\}$ is locally the graph of a $C^1$ function of the remaining $N - m = n$ coordinates. Maps $F$ whose derivative has rank $m$ at every point of $U$ are called submersions (for the conclusion it is enough that the rank be $m$ at every point of $F^{-1}(0)$ , that is, that $0$ be a regular value of $F$ ), and the theorem says:

The zero level set of a submersion is a smooth manifold of dimension $N - m$ .

This is the bedrock of differential geometry.

Worked Examples

Example. The unit circle, revisited

Let $f(x, y) = x^2 + y^2 - 1$ . Then $f_x = 2x$ and $f_y = 2y$ . Consider the point $(x_0, y_0) = (\sqrt{2}/2, \sqrt{2}/2)$ . We have $f(x_0, y_0) = 1/2 + 1/2 - 1 = 0$ and $f_y(x_0, y_0) = \sqrt{2} \neq 0$ . The implicit function theorem gives a $C^1$ function $g$ on a neighbourhood of $x_0 = \sqrt{2}/2$ such that $f(x, g(x)) = 0$ . The derivative formula yields $g'(x_0) = -\frac{f_x(x_0, y_0)}{f_y(x_0, y_0)} = -\frac{2 x_0}{2 y_0} = -1.$ This is the slope of the tangent line to the unit circle at $(\sqrt{2}/2, \sqrt{2}/2)$ , as expected.

At the point $(1, 0)$ we have $f_y = 0$ , so the hypothesis of the theorem fails and the theorem does not apply; indeed near $(1, 0)$ the circle cannot be written as $y = g(x)$ (two branches meet). However, $f_x(1, 0) = 2 \neq 0$ , so swapping the roles of $x$ and $y$ , the theorem gives $x = h(y)$ near $(1, 0)$ , with $h'(0) = -f_y(1, 0)/f_x(1, 0) = 0$ (a vertical tangent to the circle).
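Because the circle has explicit branches, both slopes in this example can be confirmed by finite differences. A minimal sketch, with hypothetical function names, using the upper branch $y = \sqrt{1 - x^2}$ near $(\sqrt{2}/2, \sqrt{2}/2)$ and the right branch $x = \sqrt{1 - y^2}$ near $(1, 0)$:

```python
import math

def g(x):
    # Upper branch of the unit circle, valid for |x| < 1.
    return math.sqrt(1.0 - x * x)

def h_branch(y):
    # Right branch x = h(y) of the unit circle, valid for |y| < 1.
    return math.sqrt(1.0 - y * y)

def central_diff(func, t, step=1e-6):
    return (func(t + step) - func(t - step)) / (2 * step)

x0 = math.sqrt(2) / 2
slope_g = central_diff(g, x0)           # the theorem predicts -x0/y0 = -1
slope_h = central_diff(h_branch, 0.0)   # the theorem predicts -f_y/f_x = 0

print(slope_g, slope_h)
```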

Example. An implicit curve without a closed form

Let $D = \mathbb{R} \times (-1, \infty)$ and let $f : D \to \mathbb{R}$ be $f(x, y) = \frac{1}{6}\left( 2\sqrt{3} \arctan\left( \frac{2y - 1}{\sqrt{3}} \right) + \log\left( \frac{(1 + y)^2}{1 - y + y^2} \right) \right) - x - \frac{\sqrt{3} \pi + \log 64}{18}.$ We take $(x_0, y_0) = (0, 1)$ . Direct substitution (using $\arctan(1/\sqrt{3}) = \pi/6$ , $\log 4 = 2 \log 2$ and $\log 64 = 6 \log 2$ ) gives $f(0, 1) = 0$ . The partials are $\frac{\partial f}{\partial x} = -1, \qquad \frac{\partial f}{\partial y} = \frac{1}{y^3 + 1}.$ On $D$ we have $y^3 + 1 > 0$ , and $f$ is $C^1$ on $D$ . At $(0, 1)$ , $\partial f / \partial y = 1/2 \neq 0$ . By the implicit function theorem, there are $r > 0$ and a $C^1$ function $g : (-r, r) \to \mathbb{R}$ with $g(0) = 1$ and $f(x, g(x)) = 0$ . No closed form for $g$ is available, but the derivative formula yields $g'(x) = -\frac{-1}{1/(g(x)^3 + 1)} = g(x)^3 + 1.$ In particular $g'(0) = g(0)^3 + 1 = 2$ . We have discovered that $g$ is the solution to the initial-value problem $g'(x) = g(x)^3 + 1, \qquad g(0) = 1.$
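The link between the implicit equation and the initial-value problem can be checked numerically: integrate $g' = g^3 + 1$, $g(0) = 1$ and confirm that the result stays on the curve $f = 0$. The sketch below uses the classical fourth-order Runge-Kutta scheme, which is a choice made here for illustration; the step count, the endpoint $x = 0.1$, and the function names are assumptions.

```python
import math

SQRT3 = math.sqrt(3.0)

def f(x, y):
    """The function f(x, y) from the example, defined for y > -1."""
    bracket = (2 * SQRT3 * math.atan((2 * y - 1) / SQRT3)
               + math.log((1 + y) ** 2 / (1 - y + y * y)))
    return bracket / 6 - x - (SQRT3 * math.pi + math.log(64)) / 18

def rhs(y):
    # Right-hand side of the initial-value problem g' = g^3 + 1.
    return y ** 3 + 1

def rk4_solve(x_end, steps):
    """Integrate g' = g^3 + 1, g(0) = 1 from 0 to x_end with classical RK4."""
    dx = x_end / steps
    y = 1.0
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * dx * k1)
        k3 = rhs(y + 0.5 * dx * k2)
        k4 = rhs(y + dx * k3)
        y += dx * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

x_end = 0.1
g_end = rk4_solve(x_end, steps=200)
residual = f(x_end, g_end)   # should be close to zero if g_end approximates g(x_end)

print(g_end, residual)
```

The endpoint is kept small on purpose: solutions of $g' = g^3 + 1$ blow up in finite time, so the integration is only meaningful on a short interval around $0$.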

Example. The sphere in $\mathbb{R}^3$

Let $f(x, y, z) = x^2 + y^2 + z^2 - 1$ , whose zero set is the unit sphere. Here $n = 2$ and $m = 1$ : we want to solve for $z$ in terms of $(x, y)$ . At any $(x_0, y_0, z_0)$ on the sphere with $z_0 \neq 0$ , $\partial f / \partial z = 2 z_0 \neq 0$ , so the implicit function theorem gives $z = g(x, y)$ locally, with $\frac{\partial g}{\partial x}(x_0, y_0) = -\frac{f_x}{f_z} = -\frac{x_0}{z_0}, \qquad \frac{\partial g}{\partial y}(x_0, y_0) = -\frac{y_0}{z_0}.$ The open upper hemisphere is globally the graph $z = \sqrt{1 - x^2 - y^2}$ . The hypothesis of the theorem fails on the equator $\{z = 0\}$ , and indeed the sphere is not locally a graph $z = g(x, y)$ near equatorial points.
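On the upper hemisphere the formulas can be cross-checked against the explicit graph by differentiating it directly:

```latex
g(x, y) = \sqrt{1 - x^2 - y^2}
\quad\Longrightarrow\quad
\frac{\partial g}{\partial x}(x, y) = \frac{-x}{\sqrt{1 - x^2 - y^2}} = -\frac{x}{g(x, y)},
\qquad
\frac{\partial g}{\partial y}(x, y) = -\frac{y}{g(x, y)}.
```

Evaluating at $(x_0, y_0)$ with $z_0 = g(x_0, y_0)$ recovers $-x_0/z_0$ and $-y_0/z_0$, in agreement with the implicit function theorem.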

Example. A system of two equations in three unknowns

Consider the system $\begin{cases} f_1(x, y_1, y_2) = x^2 + y_1^2 + y_2^2 - 3 = 0, \\ f_2(x, y_1, y_2) = x + y_1 + y_2 - 3 = 0, \end{cases}$ at the point $(x_0, y_1^0, y_2^0) = (1, 1, 1)$ . Both equations are satisfied. Here $n = 1$ , $m = 2$ . The partial Jacobian in $(y_1, y_2)$ is $B = \begin{pmatrix} 2 y_1 & 2 y_2 \\ 1 & 1 \end{pmatrix} \Bigg|_{(1,1,1)} = \begin{pmatrix} 2 & 2 \\ 1 & 1 \end{pmatrix}, \qquad \det B = 0.$ So the hypothesis fails: the theorem does not let us solve for both $y_1, y_2$ in terms of $x$ alone near $(1,1,1)$ . Geometrically, the plane is tangent to the sphere at this point: its distance from the origin is $3/\sqrt{3} = \sqrt{3}$ , the radius of the sphere, so the two surfaces touch only at $(1,1,1)$ instead of crossing transversely, and the theorem applies to no choice of two of the three variables. For a plane that cuts the sphere transversely, for instance $x + y_1 + y_2 = c$ with $|c| < 3$ , the intersection is a circle, $\det B = 2(y_1 - y_2)$ , and we can locally solve for $(y_1, y_2)$ in terms of $x$ at every point of the circle with $y_1 \neq y_2$ .
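The determinant of $B$, together with the other two $2 \times 2$ minors of the full Jacobian of $(f_1, f_2)$ at $(1, 1, 1)$, can be evaluated directly. The short sketch below (helper names are hypothetical) shows that all three vanish, which is how tangential contact between the sphere and the plane appears in coordinates.

```python
from itertools import combinations

def jacobian(x, y1, y2):
    """Rows are the gradients of f1 and f2 with respect to (x, y1, y2)."""
    return ((2 * x, 2 * y1, 2 * y2),
            (1.0, 1.0, 1.0))

def minors_2x2(J):
    """Map each pair of column indices to the determinant of that 2x2 submatrix."""
    return {
        (i, j): J[0][i] * J[1][j] - J[0][j] * J[1][i]
        for i, j in combinations(range(3), 2)
    }

minors = minors_2x2(jacobian(1.0, 1.0, 1.0))
det_B = minors[(1, 2)]   # columns belonging to y1 and y2

print(det_B, minors)
```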

Applications

Parametrizing Curves and Surfaces

The implicit function theorem is the fundamental tool for turning implicit descriptions of geometric objects into local parametrizations:

Curves in $\mathbb{R}^2$ described by $f(x, y) = 0$ : locally a graph $y = g(x)$ or $x = h(y)$ wherever $\nabla f \neq 0$ .
Surfaces in $\mathbb{R}^3$ described by $f(x, y, z) = 0$ : locally a graph $z = g(x, y)$ (or similar) wherever $\nabla f \neq 0$ .
Curves in $\mathbb{R}^3$ described as the intersection of two surfaces $f_1 = f_2 = 0$ : locally parametrized by one of the coordinates wherever the $2 \times 2$ submatrix of the Jacobian (with rows $\nabla f_1$ , $\nabla f_2$ ) formed from the columns of the other two coordinates is invertible.

Preparing for Lagrange Multipliers

The Lagrange multiplier method for constrained optimization (minimize $f(x)$ subject to $g(x) = 0$ ) relies on the implicit function theorem. At a minimum $x^*$ of $f$ on the constraint set $\{g = 0\}$ , assuming $\nabla g(x^*) \neq 0$ , the implicit function theorem lets us parametrize the constraint set locally as a smooth surface, which lets us check that directional derivatives of $f$ along the tangent space vanish. This forces $\nabla f(x^*)$ to be normal to the constraint surface, i.e. parallel to $\nabla g(x^*)$ , yielding the multiplier equation $\nabla f = \lambda \nabla g$ . A full treatment is the subject of the next chapter.
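As a small numerical illustration of this argument, consider the hypothetical problem of extremizing $x + y$ on the unit circle $x^2 + y^2 - 1 = 0$ (an example chosen here, not from the text; the constraint is called `c` in the code to avoid a clash with the local solution function). The sketch parametrizes the constraint near the candidate point by the upper branch, checks that the derivative of the objective along the constraint vanishes there, and checks that the two gradients are parallel.

```python
import math

def objective(x, y):
    return x + y

def grad_objective(x, y):
    return (1.0, 1.0)

def grad_constraint(x, y):
    # Gradient of the constraint c(x, y) = x^2 + y^2 - 1.
    return (2 * x, 2 * y)

def upper_branch(x):
    # Local parametrization y = sqrt(1 - x^2) of the constraint set.
    return math.sqrt(1.0 - x * x)

def restricted(x):
    # Objective restricted to the constraint set near the candidate point.
    return objective(x, upper_branch(x))

x_star = math.sqrt(2) / 2
y_star = upper_branch(x_star)

step = 1e-6
tangential_derivative = (restricted(x_star + step) - restricted(x_star - step)) / (2 * step)

gf = grad_objective(x_star, y_star)
gc = grad_constraint(x_star, y_star)
cross = gf[0] * gc[1] - gf[1] * gc[0]   # zero exactly when the gradients are parallel
lam = gf[0] / gc[0]                     # multiplier in grad f = lam * grad c

print(tangential_derivative, cross, lam)
```

At this point the objective attains its maximum on the circle; the multiplier equation is a first-order condition and looks the same at constrained minima and maxima.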

The implicit function theorem is one of the most consequential theorems in analysis: it bridges algebra (solving equations) and geometry (level sets as manifolds), it underpins the inverse function theorem and Lagrange multipliers, it is the starting point of differential topology and differential geometry, and its proof via Banach's fixed-point theorem illustrates the deep power of contraction mappings — a strategy that recurs throughout analysis, from solving differential equations (Picard iteration) to the spectral theory of operators.