ACE 328/Chapter 10

The Implicit Function Theorem

When can an implicit equation F(x, y) = 0 be solved locally for y as a function of x? The implicit function theorem gives a sufficient condition: a non-singular partial Jacobian. The inverse function theorem follows as a special case.

Given an equation $F(x, y) = 0$ relating two variables, when can we solve for $y$ as a function of $x$? Globally, rarely: the unit circle $x^2 + y^2 - 1 = 0$ is not the graph of any single function $y = g(x)$ because two values of $y$ correspond to each $x \in (-1, 1)$. But locally, in a neighbourhood of a given point on the curve, the answer is often yes, provided the partial derivative of $F$ with respect to $y$ does not vanish at that point. The implicit function theorem makes this precise, in arbitrary dimensions, and does so constructively: it proves the existence of the local solution function $g$ by applying the Banach fixed-point theorem to a contraction built from the data. It also gives an explicit formula for the derivative of $g$, even without a closed-form expression for $g$ itself. Its special case, the inverse function theorem, is the local invertibility criterion for smooth maps. This chapter develops the theorem, proves it in full, and explores its geometric and computational consequences.


Motivation: The Circle

The unit circle $C = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ is not globally a graph, but every point of $C$ has a neighbourhood in which $C$ is the graph of a function of one of the variables:

  • Near any point with $y > 0$: $y = \sqrt{1 - x^2}$.
  • Near any point with $y < 0$: $y = -\sqrt{1 - x^2}$.
  • Near any point with $x > 0$: $x = \sqrt{1 - y^2}$.
  • Near any point with $x < 0$: $x = -\sqrt{1 - y^2}$.

At $(\pm 1, 0)$ we cannot write $y$ as a function of $x$, but we can write $x$ as a function of $y$. The function $f(x, y) = x^2 + y^2 - 1$ has partials $f_x = 2x$ and $f_y = 2y$. The inability to solve for $y$ in terms of $x$ at $(\pm 1, 0)$ is mirrored by $f_y = 2y = 0$ there. The inability to solve for $x$ in terms of $y$ at $(0, \pm 1)$ is mirrored by $f_x = 0$ there. The implicit function theorem promotes this observation to a general principle.


Two-Dimensional Implicit Function Theorem

Theorem. Dini's Implicit Function Theorem (2D, 1877)

Let $D \subseteq \mathbb{R}^2$ be open and $f : D \to \mathbb{R}$ of class $C^1$ on $D$. Let $(x_0, y_0) \in D$ satisfy
$$f(x_0, y_0) = 0 \quad \text{and} \quad \frac{\partial f}{\partial y}(x_0, y_0) \neq 0.$$
Then there exist $r > 0$ and a $C^1$ function $g : (x_0 - r, x_0 + r) \to \mathbb{R}$ such that $g(x_0) = y_0$, and for every $(x, y)$ with $|x - x_0| < r$ and $|y - y_0| < r$,
$$f(x, y) = 0 \iff y = g(x).$$
Moreover, for every $x \in (x_0 - r, x_0 + r)$,
$$\frac{\partial f}{\partial x}(x, g(x)) + \frac{\partial f}{\partial y}(x, g(x)) \cdot g'(x) = 0, \qquad \text{so } g'(x) = -\frac{(\partial f / \partial x)(x, g(x))}{(\partial f / \partial y)(x, g(x))}.$$

Remark.
Intuition: The condition $f_y(x_0, y_0) \neq 0$ says that the tangent line to the level curve $\{f = 0\}$ at $(x_0, y_0)$ is not vertical, that is, not parallel to the $y$-axis. Infinitesimally, the linearization $f_x(x_0, y_0) \Delta x + f_y(x_0, y_0) \Delta y = 0$ can be solved for $\Delta y$ in terms of $\Delta x$ iff $f_y(x_0, y_0) \neq 0$. The theorem says this infinitesimal solvability extends to a full nonlinear local solution. The derivative formula is obtained by applying the chain rule to the identity $f(x, g(x)) \equiv 0$ and solving for $g'$.
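The derivative formula can be sanity-checked numerically on the unit circle, where an explicit local solution happens to be available. The sketch below is only an illustration: the sample point `x0 = 0.3`, the step size, and the function names are assumptions, not part of the chapter.

```python
import math

def f_x(x, y):
    """Partial derivative of f(x, y) = x^2 + y^2 - 1 with respect to x."""
    return 2.0 * x

def f_y(x, y):
    """Partial derivative of f(x, y) = x^2 + y^2 - 1 with respect to y."""
    return 2.0 * y

def g(x):
    """Explicit local solution on the upper semicircle (y > 0)."""
    return math.sqrt(1.0 - x * x)

def g_prime_implicit(x):
    """Slope predicted by the theorem: g'(x) = -f_x / f_y at (x, g(x))."""
    y = g(x)
    return -f_x(x, y) / f_y(x, y)

def g_prime_numeric(x, h=1e-6):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

x0 = 0.3  # sample point, chosen arbitrarily inside (-1, 1)
implicit_slope = g_prime_implicit(x0)
numeric_slope = g_prime_numeric(x0)
print(implicit_slope, numeric_slope)
```

The two printed values should agree to several digits; the agreement degrades as `x0` approaches $\pm 1$, where $f_y$ tends to zero.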

The 2D theorem follows from the higher-dimensional version proven below. We skip a direct proof and pass to the general case.


Preliminary Tools

The proof of the full implicit function theorem rests on two results we state and use.

Theorem. Mean Value Inequality for Vector-Valued Maps

Let $U \subseteq \mathbb{R}^m$ be open and $F : U \to \mathbb{R}^m$ differentiable on $U$. Suppose the line segment $[p, q] = \{p + t(q - p) : 0 \leq t \leq 1\}$ is contained in $U$. Then
$$\|F(q) - F(p)\| \leq M \|q - p\|, \qquad M = \sup_{x \in U} \|DF(x)\|_{\mathrm{op}},$$
where $\|L\|_{\mathrm{op}} = \sup_{\|v\|=1} \|Lv\|$ is the operator norm.

Theorem. Banach Fixed-Point Theorem (Contraction Mapping Principle)

Let $Y \subseteq \mathbb{R}^m$ be a non-empty closed subset, and let $K : Y \to Y$ be a contraction: there exists $0 \leq \lambda < 1$ such that for every $y_1, y_2 \in Y$,
$$\|K(y_1) - K(y_2)\| \leq \lambda \|y_1 - y_2\|.$$
Then there exists a unique $y^* \in Y$ with $K(y^*) = y^*$.

Remark.
Intuition: A contraction shrinks distances by a factor $\lambda < 1$. Iterating it from any starting point in $Y$ produces a Cauchy sequence (successive differences shrink geometrically), which converges to a point of $Y$ because $Y$ is closed in the complete space $\mathbb{R}^m$. The limit must be fixed because $K$ is continuous, and it is unique because the distance between two fixed points would be at most $\lambda$ times itself, which forces it to be zero.
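The iteration described in the remark can be sketched in a few lines. As an illustrative choice (not taken from the chapter), take $K = \cos$ on $Y = [0, 1]$: it maps $Y$ into itself and satisfies $|K'(y)| = |\sin y| \leq \sin 1 < 1$ there, so it is a contraction. The helper name `fixed_point` and the tolerance are assumptions.

```python
import math

def fixed_point(K, y_start, tol=1e-12, max_iter=1000):
    """Iterate y <- K(y) until successive iterates differ by less than tol.

    Returns the last iterate and the number of iterations used.
    """
    y = y_start
    for n in range(max_iter):
        y_next = K(y)
        if abs(y_next - y) < tol:
            return y_next, n + 1
        y = y_next
    raise RuntimeError("iteration did not converge within max_iter steps")

# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1], and contracts there.
y_star, steps = fixed_point(math.cos, 0.5)
print(y_star, steps)
```

Starting from a different point of $[0, 1]$ changes the number of steps but not the limit, which is the uniqueness part of the theorem in action.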

The General Implicit Function Theorem

Theorem. Implicit Function Theorem

Let $U \subseteq \mathbb{R}^{n+m} = \mathbb{R}^n \times \mathbb{R}^m$ be open, and let $(x_0, y_0) \in U$. Let $f : U \to \mathbb{R}^m$ be of class $C^1$ in a neighbourhood of $(x_0, y_0)$, with $f(x_0, y_0) = 0$. Write $f = (f_1, \ldots, f_m)$ and introduce the $m \times n$ and $m \times m$ matrices
$$A = \left( \frac{\partial f_i}{\partial x_j}(x_0, y_0) \right)_{\substack{i = 1, \ldots, m \\ j = 1, \ldots, n}}, \qquad B = \left( \frac{\partial f_i}{\partial y_j}(x_0, y_0) \right)_{i, j = 1, \ldots, m}.$$
Assume that $B$ is invertible. Then there exist an open neighbourhood $X$ of $x_0$ in $\mathbb{R}^n$, an open neighbourhood $Y$ of $y_0$ in $\mathbb{R}^m$ with $X \times Y \subseteq U$, and a $C^1$ function $g : X \to \mathbb{R}^m$ with $g(x_0) = y_0$, such that for all $(x, y) \in X \times Y$,
$$f(x, y) = 0 \iff y = g(x).$$
Moreover, for every $x \in X$, the derivative $Dg(x) : \mathbb{R}^n \to \mathbb{R}^m$ is given by the matrix
$$-B_x^{-1} A_x, \qquad \text{where } A_x = \left( \frac{\partial f_i}{\partial x_j}(x, g(x)) \right), \quad B_x = \left( \frac{\partial f_i}{\partial y_j}(x, g(x)) \right).$$

Remark.
Intuition: Split variables into "inputs" $x$ and "outputs to be solved for" $y$. The hypothesis "$B$ is invertible" says the infinitesimal system $A \Delta x + B \Delta y = 0$ is uniquely solvable for $\Delta y$: $\Delta y = -B^{-1} A \Delta x$. The theorem lifts this to the nonlinear level: the map $x \mapsto y(x) = g(x)$ exists locally, is $C^1$, and its derivative is the infinitesimal solution $-B^{-1} A$ (evaluated at the current point). The proof rewrites the equation $f(x, y) = 0$ as a fixed-point problem for $y$ at each $x$ and applies the contraction mapping principle.
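The fixed-point reformulation can be tried numerically. The sketch below assumes one standard choice of contraction, $K_x(y) = y - B^{-1} f(x, y)$ with $B$ frozen at the base point; the text above does not spell out which map is used, so this is an illustration rather than the chapter's proof. The circle, the base point $(0.6, 0.8)$, and the function names are likewise assumptions made for the example, and here $m = 1$, so $B$ is a single number.

```python
import math

def f(x, y):
    """Defining function of the unit circle."""
    return x * x + y * y - 1.0

X0, Y0 = 0.6, 0.8   # base point on the circle (assumed for illustration)
B = 2.0 * Y0        # f_y(X0, Y0); a 1x1 "matrix", nonzero, hence invertible

def solve_for_y(x, iterations=50):
    """Iterate K_x(y) = y - f(x, y) / B, starting from y = Y0."""
    y = Y0
    for _ in range(iterations):
        y = y - f(x, y) / B
    return y

x = 0.65                       # a nearby value of x
y = solve_for_y(x)
print(y, math.sqrt(1.0 - x * x))
```

Because $B$ is frozen at the base point, $K_x$ has derivative $1 - f_y(x, y)/B$ in $y$, which is close to zero near $(x_0, y_0)$; that smallness is what makes the map a contraction on a small enough neighbourhood.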

The Inverse Function Theorem

We first pause for some terminology, then state the one-variable warm-up, then pass to the general case.

Definition. Homeomorphism and Diffeomorphism

Let $U, V \subseteq \mathbb{R}^n$ be open.

  • A map $F : U \to V$ is a homeomorphism if it is a continuous bijection with continuous inverse.
  • For $r \geq 1$, $F$ is a $C^r$ diffeomorphism if $F$ is a bijection of class $C^r$ whose inverse $F^{-1}$ is also of class $C^r$.
  • $F$ is a local $C^r$ diffeomorphism at $a \in U$ if there exist open neighbourhoods $U_0 \subseteq U$ of $a$ and $V_0 \subseteq V$ of $F(a)$ such that $F|_{U_0} : U_0 \to V_0$ is a $C^r$ diffeomorphism.

Remark.
Intuition: A homeomorphism preserves topological structure; a diffeomorphism preserves smooth structure. Local diffeomorphisms are the natural analogues of "locally invertible with differentiable inverse" for $\mathbb{R}^n$.

The One-Variable Inverse Function Theorem

Theorem. 1D Inverse Function Theorem

Let $J \subseteq \mathbb{R}$ be an open interval, $a \in J$, and $f : J \to \mathbb{R}$ be of class $C^1$. If $f'(a) \neq 0$ (equivalently, the linear map $\mathbb{R} \ni v \mapsto f'(a) v \in \mathbb{R}$ is invertible), then $f$ is locally invertible on some open interval $I \subseteq J$ containing $a$. The inverse $f^{-1} : f(I) \to I$ is of class $C^1$, and
$$(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))} \qquad \text{for every } y \in f(I).$$
In other words, $f$ is a local $C^1$ diffeomorphism at $a$.

Remark.
Intuition: Since $f'$ is continuous and $f'(a) \neq 0$, $f'$ keeps a constant sign on an interval around $a$, so $f$ is strictly monotonic, and hence injective, near $a$. A continuous strictly monotonic function on an interval has a continuous inverse, and the formula for $(f^{-1})'$ follows by differentiating $f(f^{-1}(y)) = y$.
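As a rough numerical illustration of the one-variable formula, the sketch below inverts an increasing function by bisection and compares a finite-difference slope of the inverse with $1 / f'(f^{-1}(y))$. The function $f(x) = x^3 + x$, the point $a = 1$, and the bracketing interval $[0, 2]$ are assumptions chosen for the example.

```python
def f(x):
    """A strictly increasing C^1 function (f'(x) = 3x^2 + 1 > 0)."""
    return x ** 3 + x

def f_prime(x):
    return 3.0 * x * x + 1.0

def f_inverse(y, lo=0.0, hi=2.0):
    """Invert f on [lo, hi] by bisection; valid because f is increasing there."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = 1.0
b = f(a)          # b = 2.0
h = 1e-5
numeric = (f_inverse(b + h) - f_inverse(b - h)) / (2.0 * h)   # slope of the inverse at b
formula = 1.0 / f_prime(f_inverse(b))                          # 1 / f'(f^{-1}(b))
print(numeric, formula)
```

Here $f'(1) = 4$, so both numbers should be close to $1/4$.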

The Multivariable Version

The multivariable inverse function theorem is a special case of the implicit function theorem in which $f(x, y) = F(y) - x$, so solving $f(x, y) = 0$ means inverting $F$.

Theorem. Inverse Function Theorem

Let $V \subseteq \mathbb{R}^m$ be open, $F : V \to \mathbb{R}^m$ of class $C^1$, and $y_0 \in V$ with $DF(y_0)$ invertible. Set $x_0 = F(y_0)$. Then there exist an open neighbourhood $W$ of $x_0$ and an open neighbourhood $V_0 \subseteq V$ of $y_0$, and a $C^1$ function $F^{-1} : W \to V_0$ such that $F^{-1}(F(y)) = y$ for $y \in V_0$ and $F(F^{-1}(x)) = x$ for $x \in W$. Moreover,
$$D(F^{-1})(x) = (DF(F^{-1}(x)))^{-1}.$$

Remark.
Intuition: If the derivative $DF(y_0)$ is invertible, then to first order $F$ acts as an invertible linear map near $y_0$. The theorem says this local linear invertibility lifts to local nonlinear invertibility. The derivative formula is the infinitesimal version of the inverse: the derivative of the inverse is the inverse of the derivative.
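As a short check that the two derivative formulas agree, apply the implicit function theorem (with $n = m$) to $f(x, y) = F(y) - x$ and write $g$ for the resulting local solution, so that $F(g(x)) = x$. The partial Jacobians are
$$A_x = \frac{\partial f}{\partial x}(x, g(x)) = -I_m, \qquad B_x = \frac{\partial f}{\partial y}(x, g(x)) = DF(g(x)),$$
and therefore
$$Dg(x) = -B_x^{-1} A_x = -\bigl(DF(g(x))\bigr)^{-1} (-I_m) = \bigl(DF(g(x))\bigr)^{-1},$$
which is the formula $D(F^{-1})(x) = (DF(F^{-1}(x)))^{-1}$ with $g = F^{-1}$.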
Remark.
An alternative construction (Lecture 27). The reduction "inverse function = implicit function with $f(x, y) = F(y) - x$" is often presented in the opposite direction: one starts with the target equation $F(y) = x$, defines an auxiliary function $\Phi : \mathbb{R}^m \times V \to \mathbb{R}^m$ by $\Phi(x, y) = F(y) - x$, notes that $\Phi$ is $C^1$ with $\Phi(x_0, y_0) = 0$ and $\partial \Phi / \partial y(x_0, y_0) = DF(y_0)$ invertible, and then extracts from the implicit function theorem a $C^1$ map $h$ on a neighbourhood $W$ of $x_0$ with $\Phi(x, h(x)) = 0$, i.e. $F(h(x)) = x$. So $h$ is a right inverse of $F$. The chain rule applied to $F \circ h = \mathrm{id}_W$ gives $DF(h(x)) \circ Dh(x) = I$, so $Dh(x)$ is invertible; applying the same argument to $h$ (with derivative $DF(h(x))^{-1}$) produces a right inverse $g$ of $h$. Then $F = F \circ (h \circ g) = (F \circ h) \circ g = g$ locally, so $h \circ F = h \circ g = \mathrm{id}$ and $h$ is a two-sided inverse of $F$, i.e. $h = F^{-1}$. This mirrors exactly how one proves the one-variable version.

Corollary. Derivative of the Inverse via Chain Rule

Suppose $F : U \to \mathbb{R}^m$ is of class $C^1$ on an open set $U \subseteq \mathbb{R}^m$, $a \in U$, and $F$ has a $C^1$ inverse $F^{-1}$ defined on a neighbourhood of $b = F(a)$. Then
$$D(F^{-1})(b) = \bigl(DF(a)\bigr)^{-1}.$$

Remark.
Intuition: Once we know that $F^{-1}$ exists and is differentiable, the derivative formula comes for free from the chain rule. The hard content of the inverse function theorem is the existence of $F^{-1}$; the formula is an easy consequence.
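The identity $DF(a) \cdot D(F^{-1})(b) = I$ can be checked by hand-coded Jacobians in a familiar case. The polar-coordinate map $F(r, \theta) = (r\cos\theta, r\sin\theta)$ is used below purely as an illustration (it does not appear in the chapter); its local inverse away from the origin and the negative $x$-axis is $(x, y) \mapsto (\sqrt{x^2 + y^2}, \operatorname{atan2}(y, x))$. All function names are hypothetical.

```python
import math

def polar_to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def jacobian_forward(r, theta):
    """DF(r, theta) for F(r, theta) = (r cos(theta), r sin(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s],
            [s, r * c]]

def jacobian_inverse(x, y):
    """D(F^{-1})(x, y) for F^{-1}(x, y) = (sqrt(x^2 + y^2), atan2(y, x))."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def matmul2(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, theta = 2.0, 0.7                      # the point a, with r > 0 so det DF = r != 0
x, y = polar_to_cartesian(r, theta)      # the point b = F(a)
product = matmul2(jacobian_forward(r, theta), jacobian_inverse(x, y))
print(product)                           # should be close to the identity matrix
```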

Geometric Interpretation: Level Sets as Smooth Graphs

Let $f : U \to \mathbb{R}^m$ with $U \subseteq \mathbb{R}^{n+m}$ open, $f$ of class $C^1$. The level set $\{(x, y) \in U : f(x, y) = 0\}$ is a subset of $\mathbb{R}^{n+m}$. The implicit function theorem says that, at points where the partial Jacobian with respect to the last $m$ variables is invertible, the level set is locally the graph of a $C^1$ function $g : \mathbb{R}^n \to \mathbb{R}^m$. Such a subset of $\mathbb{R}^{n+m}$, one that is locally the graph of a smooth function, is called a smooth $n$-dimensional submanifold of $\mathbb{R}^{n+m}$.

More generally, given $F : U \to \mathbb{R}^m$ of class $C^1$ with $U \subseteq \mathbb{R}^N$ open, if $DF(p)$ has full rank $m$ at some point $p$ with $F(p) = 0$, then there is a choice of $m$ coordinates in $\mathbb{R}^N$ whose partials give an invertible $m \times m$ submatrix; after reindexing, the implicit function theorem applies and the level set $\{F = 0\}$ is locally the graph of a $C^1$ function of the remaining $N - m = n$ coordinates. Maps $F$ whose derivative has rank $m$ at every point of the domain are called submersions (when the rank condition is known only at the points of $F^{-1}(0)$, one says that $0$ is a regular value of $F$, and the same conclusion holds), and the theorem says:

The zero level set of a submersion is a smooth manifold of dimension $N - m$.

This is the bedrock of differential geometry.


Worked Examples

Example. The unit circle, revisited

Let $f(x, y) = x^2 + y^2 - 1$. Then $f_x = 2x$ and $f_y = 2y$. Consider the point $(x_0, y_0) = (\sqrt{2}/2, \sqrt{2}/2)$. We have $f(x_0, y_0) = 1/2 + 1/2 - 1 = 0$ and $f_y(x_0, y_0) = \sqrt{2} \neq 0$. The implicit function theorem gives a $C^1$ function $g$ on a neighbourhood of $x_0 = \sqrt{2}/2$ such that $f(x, g(x)) = 0$. The derivative formula yields
$$g'(x_0) = -\frac{f_x(x_0, y_0)}{f_y(x_0, y_0)} = -\frac{2 x_0}{2 y_0} = -1.$$
This is the slope of the tangent line to the unit circle at $(\sqrt{2}/2, \sqrt{2}/2)$, as expected.

At the point $(1, 0)$ we have $f_y = 0$, so the hypothesis of the theorem fails and the theorem gives no conclusion. Indeed, near $(1, 0)$ the circle cannot be written as $y = g(x)$ (two branches meet). However, $f_x(1, 0) = 2 \neq 0$, so swapping the roles of $x$ and $y$, the theorem gives $x = h(y)$ near $(1, 0)$, with $h'(0) = -f_y(1, 0)/f_x(1, 0) = 0$ (a vertical tangent).

Example. An implicit curve without a closed form

Let $D = \mathbb{R} \times (-1, \infty)$ and let $f : D \to \mathbb{R}$ be
$$f(x, y) = \frac{1}{6}\left( 2\sqrt{3} \arctan\left( \frac{2y - 1}{\sqrt{3}} \right) + \log\left( \frac{(1 + y)^2}{1 - y + y^2} \right) \right) - x - \frac{\sqrt{3} \pi + \log 64}{18}.$$
(The restriction to $y > -1$ keeps the logarithm defined and ensures $y^3 + 1 > 0$.) We take $(x_0, y_0) = (0, 1)$. Direct substitution (using $\arctan(1/\sqrt{3}) = \pi/6$, $\log 4 = 2 \log 2$, and $\log 64 = 6 \log 2$) gives $f(0, 1) = 0$. The partials are
$$\frac{\partial f}{\partial x} = -1, \qquad \frac{\partial f}{\partial y} = \frac{1}{y^3 + 1},$$
so $f$ is $C^1$ on $D$. At $(0, 1)$, $\partial f / \partial y = 1/2 \neq 0$. By the implicit function theorem, there is $r > 0$ and a $C^1$ function $g : (-r, r) \to \mathbb{R}$ with $g(0) = 1$ and $f(x, g(x)) = 0$. No closed form for $g$ is available, but the derivative formula yields
$$g'(x) = -\frac{-1}{1/(g(x)^3 + 1)} = g(x)^3 + 1.$$
In particular $g'(0) = g(0)^3 + 1 = 2$. We have discovered that $g$ is the solution to the initial-value problem
$$g'(x) = g(x)^3 + 1, \qquad g(0) = 1.$$
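Although $g$ has no closed form, the initial-value problem gives a practical way to compute it. The sketch below integrates $g' = g^3 + 1$, $g(0) = 1$ with a classical fourth-order Runge-Kutta scheme and then checks that the computed value nearly satisfies $f(x, g(x)) = 0$. The choice of integrator, the step count, and the evaluation point $x = 0.1$ are assumptions made for illustration.

```python
import math

def f(x, y):
    """The implicit relation from the example (defined for y > -1)."""
    bracket = (2.0 * math.sqrt(3.0) * math.atan((2.0 * y - 1.0) / math.sqrt(3.0))
               + math.log((1.0 + y) ** 2 / (1.0 - y + y * y)))
    constant = (math.sqrt(3.0) * math.pi + math.log(64.0)) / 18.0
    return bracket / 6.0 - x - constant

def rhs(y):
    """Right-hand side of the ODE g' = g^3 + 1."""
    return y ** 3 + 1.0

def rk4_solve(x_end, steps=1000):
    """Integrate g' = g^3 + 1 from g(0) = 1 up to x_end with classical RK4."""
    h = x_end / steps
    y = 1.0
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

g_value = rk4_solve(0.1)          # approximate g(0.1)
residual = f(0.1, g_value)        # should be close to zero
print(g_value, residual)
```

A small residual is consistent with the claim that the ODE solution and the implicitly defined function coincide near $x_0 = 0$.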

Example. The sphere in $\mathbb{R}^3$

Let $f(x, y, z) = x^2 + y^2 + z^2 - 1$, whose zero set is the unit sphere. Here $n = 2$ and $m = 1$: we want to solve for $z$ in terms of $(x, y)$. At any $(x_0, y_0, z_0)$ on the sphere with $z_0 \neq 0$, $\partial f / \partial z = 2 z_0 \neq 0$, so the implicit function theorem gives $z = g(x, y)$ locally, with
$$\frac{\partial g}{\partial x}(x_0, y_0) = -\frac{f_x}{f_z} = -\frac{x_0}{z_0}, \qquad \frac{\partial g}{\partial y}(x_0, y_0) = -\frac{f_y}{f_z} = -\frac{y_0}{z_0}.$$
The upper hemisphere is globally $z = \sqrt{1 - x^2 - y^2}$. The hypothesis of the theorem fails on the equator $\{z = 0\}$, and indeed the sphere is not locally a graph $z = g(x, y)$ near equatorial points.

Example. A system of two equations in three unknowns

Consider the system
$$\begin{cases} f_1(x, y_1, y_2) = x^2 + y_1^2 + y_2^2 - 3 = 0, \\ f_2(x, y_1, y_2) = x + y_1 + y_2 - 3 = 0, \end{cases}$$
at the point $(x_0, y_1^0, y_2^0) = (1, 1, 1)$. Both equations are satisfied. Here $n = 1$, $m = 2$. The partial Jacobian in $(y_1, y_2)$ is
$$B = \begin{pmatrix} 2 y_1 & 2 y_2 \\ 1 & 1 \end{pmatrix} \Bigg|_{(1,1,1)} = \begin{pmatrix} 2 & 2 \\ 1 & 1 \end{pmatrix}, \qquad \det B = 0.$$
So the hypothesis fails, and the theorem does not let us solve for both $y_1, y_2$ in terms of $x$ alone near $(1,1,1)$. Geometrically, at this point the sphere and plane intersect tangentially rather than transversely: the plane lies at distance $3/\sqrt{3} = \sqrt{3}$ from the origin, which equals the radius of the sphere, so $(1, 1, 1)$ is their only common point. For a plane that cuts the sphere transversely, such as $x + y_1 + y_2 = c$ with $|c| < 3$, the intersection is a circle, $\det B = 2(y_1 - y_2)$ is nonzero at every point of it with $y_1 \neq y_2$, and near those points we can locally solve.
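The determinant computation can be checked mechanically, and the same helper shows that the other two choices of which pair of variables to solve for also fail at $(1, 1, 1)$: every $2 \times 2$ minor of the full Jacobian vanishes there. This is a small sketch; the function names are illustrative.

```python
def jacobian(x, y1, y2):
    """Full 2x3 Jacobian of (f1, f2) with respect to (x, y1, y2)."""
    return [[2.0 * x, 2.0 * y1, 2.0 * y2],
            [1.0, 1.0, 1.0]]

def minor(J, i, j):
    """Determinant of the 2x2 submatrix of J formed by columns i and j."""
    return J[0][i] * J[1][j] - J[0][j] * J[1][i]

J = jacobian(1.0, 1.0, 1.0)

# Columns 1 and 2 correspond to (y1, y2), so this minor is det B.
det_B = minor(J, 1, 2)

# All three column pairs: (x, y1), (x, y2), (y1, y2).
all_minors = {(i, j): minor(J, i, j) for i in range(3) for j in range(i + 1, 3)}
print(det_B, all_minors)
```

Since the two rows of the Jacobian at $(1, 1, 1)$ are proportional, it has rank 1 there, so no reindexing of the variables rescues the hypothesis at this point.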


Applications

Parametrizing Curves and Surfaces

The implicit function theorem is the fundamental tool for turning implicit descriptions of geometric objects into local parametrizations:

  • Curves in $\mathbb{R}^2$ described by $f(x, y) = 0$: locally a graph $y = g(x)$ or $x = h(y)$ wherever $\nabla f \neq 0$.
  • Surfaces in $\mathbb{R}^3$ described by $f(x, y, z) = 0$: locally a graph $z = g(x, y)$ (or similar) wherever $\nabla f \neq 0$.
  • Curves in $\mathbb{R}^3$ described as the intersection of two surfaces $f_1 = f_2 = 0$: locally parametrized by one of the coordinates wherever the $2 \times 2$ minor of $(\nabla f_1, \nabla f_2)$ with respect to the other two coordinates is invertible.

Preparing for Lagrange Multipliers

The Lagrange multiplier method for constrained optimization (minimize $f(x)$ subject to $g(x) = 0$) relies on the implicit function theorem. At a minimum $x^*$ of $f$ on the constraint set $\{g = 0\}$, assuming $\nabla g(x^*) \neq 0$, the implicit function theorem lets us parametrize the constraint set locally as a smooth surface, which lets us check that directional derivatives of $f$ along the tangent space vanish. This forces $\nabla f(x^*)$ to be normal to the constraint surface, i.e. parallel to $\nabla g(x^*)$, yielding the multiplier equation $\nabla f = \lambda \nabla g$. A full treatment is the subject of the next chapter.
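To see the multiplier equation in a concrete case, the sketch below minimizes the hypothetical objective $f(x, y) = x + y$ on the unit circle $g(x, y) = x^2 + y^2 - 1 = 0$ by a brute-force scan over the angle, then checks that the two gradients are parallel at the minimizer. The objective, the grid size, and all names are assumptions chosen for illustration; nothing here is taken from the next chapter.

```python
import math

def objective(x, y):
    """Illustrative objective f(x, y) = x + y."""
    return x + y

def grad_objective(x, y):
    return (1.0, 1.0)

def grad_constraint(x, y):
    """Gradient of g(x, y) = x^2 + y^2 - 1."""
    return (2.0 * x, 2.0 * y)

# Scan the circle, parametrized by the angle, for the smallest objective value.
N = 100000
best_theta = min((2.0 * math.pi * k / N for k in range(N)),
                 key=lambda t: objective(math.cos(t), math.sin(t)))
x_star, y_star = math.cos(best_theta), math.sin(best_theta)

fx, fy = grad_objective(x_star, y_star)
gx, gy = grad_constraint(x_star, y_star)
cross = fx * gy - fy * gx      # zero exactly when the two gradients are parallel
lam = fx / gx                  # multiplier in grad f = lam * grad g
print((x_star, y_star), cross, lam)
```

The scan locates the minimizer near $(-\sqrt{2}/2, -\sqrt{2}/2)$, where $\nabla g \neq 0$ and the gradients are parallel with $\lambda \approx -1/\sqrt{2}$.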


The implicit function theorem is one of the most consequential theorems in analysis: it bridges algebra (solving equations) and geometry (level sets as manifolds), it underpins the inverse function theorem and Lagrange multipliers, it is the starting point of differential topology and differential geometry, and its proof via Banach's fixed-point theorem illustrates the deep power of contraction mappings — a strategy that recurs throughout analysis, from solving differential equations (Picard iteration) to the spectral theory of operators.