When can an implicit equation F(x, y) = 0 be solved locally for y as a function of x? The implicit function theorem gives sufficient conditions in terms of a non-singular partial Jacobian, and the inverse function theorem follows as a special case.
Given an equation $F(x,y)=0$ relating two variables, when can we solve for $y$ as a function of $x$? Globally, rarely: the unit circle $x^2+y^2-1=0$ is not the graph of any single function $y=g(x)$ because two values of $y$ correspond to each $x\in(-1,1)$. But locally, in a neighbourhood of a given point on the curve, the answer is often yes, provided the partial derivative with respect to $y$ does not vanish at that point. The implicit function theorem makes this precise, in arbitrary dimensions, and does so constructively: it proves the existence of the local solution function $g$ by applying the Banach fixed-point theorem to a contraction built from the data. It also gives an explicit formula for the derivative of $g$, even without a closed-form expression for $g$ itself. Its special case, the inverse function theorem, is the local invertibility criterion for smooth maps. This chapter develops the theorem, proves it in full, and explores its geometric and computational consequences.
Motivation: The Circle
The unit circle $C=\{(x,y)\in\mathbb{R}^2 : x^2+y^2=1\}$ is not globally a graph, but every point of $C$ has a neighbourhood in which $C$ is the graph of a function of one of the variables:
Near any point with $y>0$: $y=\sqrt{1-x^2}$.
Near any point with $y<0$: $y=-\sqrt{1-x^2}$.
Near any point with $x>0$: $x=\sqrt{1-y^2}$.
Near any point with $x<0$: $x=-\sqrt{1-y^2}$.
At $(\pm 1,0)$ we cannot write $y$ as a function of $x$, but we can write $x$ as a function of $y$. The function $f(x,y)=x^2+y^2-1$ has partials $f_x=2x$ and $f_y=2y$. The inability to solve for $y$ in terms of $x$ at $(\pm 1,0)$ is mirrored by $f_y=2y=0$ there. The inability to solve for $x$ in terms of $y$ at $(0,\pm 1)$ is mirrored by $f_x=0$ there. The implicit function theorem promotes this observation to a general principle.
Two-Dimensional Implicit Function Theorem
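Theorem. Dini's Implicit Function Theorem (2D, 1877)
Let $D\subseteq\mathbb{R}^2$ be open and $f:D\to\mathbb{R}$ of class $C^1$ on $D$. Let $(x_0,y_0)\in D$ satisfy
$$f(x_0,y_0)=0\quad\text{and}\quad\frac{\partial f}{\partial y}(x_0,y_0)\neq 0.$$
Then there exist $r>0$ and a $C^1$ function $g:(x_0-r,x_0+r)\to\mathbb{R}$ such that $g(x_0)=y_0$, and for every $(x,y)$ with $|x-x_0|<r$ and $|y-y_0|<r$,
$$f(x,y)=0\iff y=g(x).$$
Moreover, for every $x\in(x_0-r,x_0+r)$,
$$\frac{\partial f}{\partial x}(x,g(x))+\frac{\partial f}{\partial y}(x,g(x))\cdot g'(x)=0,\quad\text{so}\quad g'(x)=-\frac{\partial f/\partial x\,(x,g(x))}{\partial f/\partial y\,(x,g(x))}.$$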
Remark.
Intuition: The condition $f_y(x_0,y_0)\neq 0$ says that the tangent line to the level curve $\{f=0\}$ at $(x_0,y_0)$ is not vertical, i.e. not parallel to the $y$-axis. Infinitesimally, the linearization $f_x(x_0,y_0)\,\Delta x+f_y(x_0,y_0)\,\Delta y=0$ can be solved for $\Delta y$ in terms of $\Delta x$ iff $f_y(x_0,y_0)\neq 0$. The theorem says this infinitesimal solvability extends to a full nonlinear local solution. The derivative formula is obtained by applying the chain rule to the identity $f(x,g(x))\equiv 0$ and solving for $g'$.
The 2D theorem follows from the higher-dimensional version proven below. We skip a direct proof and pass to the general case.
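The derivative formula of the 2D theorem can be checked numerically on an equation with no convenient closed-form solution. The sketch below is only an illustration: the function $f(x,y)=y^5+y-x$, the bisection solver and all names are assumptions chosen for the example, not part of the text. It computes $g$ by solving $f(x,y)=0$ for $y$, then compares $-f_x/f_y$ with a finite-difference slope of $g$.

```python
def f(x, y):
    """Hypothetical example: f(x, y) = y**5 + y - x, with f(0, 0) = 0."""
    return y**5 + y - x

def f_x(x, y):
    return -1.0

def f_y(x, y):
    return 5.0 * y**4 + 1.0  # never zero, so the hypothesis holds everywhere

def g(x, lo=-2.0, hi=2.0, tol=1e-14):
    """Solve f(x, y) = 0 for y by bisection; f is strictly increasing in y."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(x, mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def g_prime_formula(x):
    """Slope predicted by the theorem: g'(x) = -f_x / f_y at (x, g(x))."""
    y = g(x)
    return -f_x(x, y) / f_y(x, y)

def g_prime_numeric(x, h=1e-5):
    """Central finite difference of the numerically computed g."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

x = 0.5
print(g(x), g_prime_formula(x), g_prime_numeric(x))
```

The two slopes should agree up to the finite-difference error; the bracket $[-2,2]$ is adequate only for moderate $|x|$.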
Preliminary Tools
The proof of the full implicit function theorem rests on two results we state and use.
Theorem. Mean Value Inequality for Vector-Valued Maps
Let $U\subseteq\mathbb{R}^m$ be open and $F:U\to\mathbb{R}^m$ differentiable on $U$. Suppose the line segment $[p,q]=\{p+t(q-p):0\le t\le 1\}$ is contained in $U$. Then
$$\|F(q)-F(p)\|\le M\,\|q-p\|,\qquad M=\sup_{x\in U}\|DF(x)\|_{\mathrm{op}},$$
where $\|L\|_{\mathrm{op}}=\sup_{\|v\|=1}\|Lv\|$ is the operator norm.
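Theorem. Banach Fixed-Point Theorem (Contraction Mapping Principle)
Let $Y\subseteq\mathbb{R}^m$ be a non-empty closed subset, and let $K:Y\to Y$ be a contraction: there exists $0\le\lambda<1$ such that for every $y_1,y_2\in Y$,
$$\|K(y_1)-K(y_2)\|\le\lambda\,\|y_1-y_2\|.$$
Then there exists a unique $y^*\in Y$ with $K(y^*)=y^*$.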
Remark.
Intuition: A contraction shrinks distances by a factor $\lambda<1$. Iterating it produces a Cauchy sequence (successive differences shrink geometrically), which converges because $Y$ is closed in the complete space $\mathbb{R}^m$. The limit must be fixed because $K$ is continuous, and it is unique because the distance between two fixed points would be at most $\lambda$ times itself, which forces it to be zero.
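The iteration described above is easy to try out. As a minimal sketch, take $K=\cos$ on $Y=[0,1]$, an example chosen here and not taken from the text: $\cos$ maps $[0,1]$ into itself and $|K'|\le\sin 1<1$ there, so it is a contraction. The helper name `fixed_point` and its tolerances are likewise assumptions.

```python
import math

def fixed_point(K, y0, tol=1e-12, max_iter=1000):
    """Iterate y_{k+1} = K(y_k) until successive iterates differ by less than tol."""
    y = y0
    for k in range(1, max_iter + 1):
        y_next = K(y)
        if abs(y_next - y) < tol:
            return y_next, k
        y = y_next
    raise RuntimeError("no convergence within max_iter iterations")

# Two different starting points in [0, 1] lead to the same fixed point.
y_a, steps_a = fixed_point(math.cos, 0.0)
y_b, steps_b = fixed_point(math.cos, 1.0)
print(y_a, steps_a)
print(y_b, steps_b)
```

Both runs should settle on the same value, which illustrates uniqueness; the number of steps grows as $\lambda$ approaches 1.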
The General Implicit Function Theorem
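Theorem. Implicit Function Theorem
Let $U\subseteq\mathbb{R}^{n+m}=\mathbb{R}^n\times\mathbb{R}^m$ be open, and let $(x_0,y_0)\in U$. Let $f:U\to\mathbb{R}^m$ be of class $C^1$ in a neighbourhood of $(x_0,y_0)$, with $f(x_0,y_0)=0$. Write $f=(f_1,\dots,f_m)$ and introduce the $m\times n$ and $m\times m$ matrices
$$A=\left(\frac{\partial f_i}{\partial x_j}(x_0,y_0)\right)_{\substack{i=1,\dots,m\\ j=1,\dots,n}},\qquad B=\left(\frac{\partial f_i}{\partial y_j}(x_0,y_0)\right)_{i,j=1,\dots,m}.$$
Assume that $B$ is invertible. Then there exist an open neighbourhood $X$ of $x_0$ in $\mathbb{R}^n$, an open neighbourhood $Y$ of $y_0$ in $\mathbb{R}^m$, and a $C^1$ function $g:X\to\mathbb{R}^m$ with $g(x_0)=y_0$, such that for every $(x,y)\in X\times Y$,
$$f(x,y)=0\iff y=g(x).$$
Moreover, for every $x\in X$ the derivative $Dg(x):\mathbb{R}^n\to\mathbb{R}^m$ is given by the matrix
$$-B_x^{-1}A_x,\qquad\text{where}\quad A_x=\left(\frac{\partial f_i}{\partial x_j}(x,g(x))\right),\quad B_x=\left(\frac{\partial f_i}{\partial y_j}(x,g(x))\right).$$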
Remark.
Intuition: Split variables into "inputs" $x$ and "outputs to be solved for" $y$. The hypothesis "$B$ is invertible" says the infinitesimal system $A\,\Delta x+B\,\Delta y=0$ is uniquely solvable for $\Delta y$: $\Delta y=-B^{-1}A\,\Delta x$. The theorem lifts this to the nonlinear level: the map $x\mapsto y(x)=g(x)$ exists locally, is $C^1$, and its derivative is the infinitesimal solution $-B^{-1}A$ (evaluated at the current point). The proof rewrites the equation $f(x,y)=0$ as a fixed-point problem for $y$ at each $x$ and applies the contraction mapping principle.
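The fixed-point reformulation can be sketched numerically. The text does not spell out which contraction it uses; the map $K_x(y)=y-B^{-1}f(x,y)$ below is the usual textbook choice and should be read as an assumption. The sketch treats the scalar case $m=n=1$ with the unit circle near $(\sqrt2/2,\sqrt2/2)$, where $B=f_y(x_0,y_0)=\sqrt2$, and compares the limit of the iteration with the known branch $\sqrt{1-x^2}$.

```python
import math

def f(x, y):
    return x * x + y * y - 1.0

x0 = y0 = math.sqrt(2.0) / 2.0
B = 2.0 * y0  # df/dy at (x0, y0): a 1x1 "matrix", nonzero, frozen at the base point

def solve_for_y(x, tol=1e-13, max_iter=200):
    """Iterate the assumed contraction K_x(y) = y - f(x, y) / B, starting from y0."""
    y = y0
    for _ in range(max_iter):
        y_next = y - f(x, y) / B
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("iteration did not converge; x may be too far from x0")

for x in (x0 - 0.05, x0, x0 + 0.05):
    print(x, solve_for_y(x), math.sqrt(1.0 - x * x))
```

Because $\partial K_x/\partial y=1-f_y(x,y)/B$ vanishes at $(x_0,y_0)$, the map is a contraction only near the base point; for $x$ far from $x_0$ the iteration may fail, which matches the local nature of the theorem.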
The Inverse Function Theorem
We first pause for some terminology, then state the one-variable warm-up, then pass to the general case.
Definition. Homeomorphism and Diffeomorphism
Let $U,V\subseteq\mathbb{R}^n$ be open.
A map $F:U\to V$ is a homeomorphism if it is a continuous bijection with continuous inverse.
For $r\ge 1$, $F$ is a $C^r$ diffeomorphism if $F$ is a bijection of class $C^r$ whose inverse $F^{-1}$ is also of class $C^r$.
$F$ is a local $C^r$ diffeomorphism at $a\in U$ if there exist open neighbourhoods $U_0\subseteq U$ of $a$ and $V_0\subseteq V$ of $F(a)$ such that $F|_{U_0}:U_0\to V_0$ is a $C^r$ diffeomorphism.
Remark.
Intuition: A homeomorphism preserves topological structure; a diffeomorphism preserves smooth structure. Local diffeomorphisms are the natural analogues of "locally invertible with differentiable inverse" for $\mathbb{R}^n$.
The One-Variable Inverse Function Theorem
Theorem. 1D Inverse Function Theorem
Let $J\subseteq\mathbb{R}$ be an open interval, $a\in J$, and $f:J\to\mathbb{R}$ be of class $C^1$. If $f'(a)\neq 0$ (equivalently, the linear map $\mathbb{R}\ni v\mapsto f'(a)v\in\mathbb{R}$ is invertible), then $f$ is locally invertible on some open interval $I\subseteq J$ containing $a$. The inverse $f^{-1}:f(I)\to I$ is of class $C^1$, and
$$(f^{-1})'(y)=\frac{1}{f'(f^{-1}(y))}\quad\text{for every } y\in f(I).$$
In other words, $f$ is a local $C^1$ diffeomorphism at $a$.
Remark.
Intuition: Since $f'$ is continuous, $f'(a)\neq 0$ means $f'$ keeps a constant sign near $a$, so $f$ is strictly monotonic, hence injective, near $a$; continuity of $f$ then hands us a continuous inverse, and the formula for $(f^{-1})'$ follows by differentiating $f(f^{-1}(y))=y$.
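The one-variable formula can be checked on a function whose inverse has no elementary closed form. The sketch below assumes the example $f(x)=xe^x$ near $a=1$ (so $f'(a)=2e\neq 0$) and inverts it with Newton's method; the function, the base point and the helper names are illustrative choices, not taken from the text.

```python
import math

def f(x):
    return x * math.exp(x)

def f_prime(x):
    return (1.0 + x) * math.exp(x)

def f_inverse(y, a=1.0, tol=1e-14, max_iter=50):
    """Newton's method for f(x) = y, started at the base point a."""
    x = a
    for _ in range(max_iter):
        step = (f(x) - y) / f_prime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge; y may be too far from f(a)")

y = f(1.0) + 0.3                # a value near f(a)
x = f_inverse(y)
formula = 1.0 / f_prime(x)      # (f^{-1})'(y) = 1 / f'(f^{-1}(y))
h = 1e-6
numeric = (f_inverse(y + h) - f_inverse(y - h)) / (2.0 * h)
print(x, formula, numeric)
```

The slope from the formula and the finite-difference slope of the numerically computed inverse should agree to several digits.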
The Multivariable Version
The multivariable inverse function theorem is a special case of the implicit function theorem in which f(x,y)=F(y)−x, so solving f(x,y)=0 means inverting F.
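Theorem. Inverse Function Theorem
Let $V\subseteq\mathbb{R}^m$ be open, $F:V\to\mathbb{R}^m$ of class $C^1$, and $y_0\in V$ with $DF(y_0)$ invertible. Set $x_0=F(y_0)$. Then there exist an open neighbourhood $W$ of $x_0$ and an open neighbourhood $V_0\subseteq V$ of $y_0$ with $F(V_0)=W$, and a $C^1$ function $F^{-1}:W\to V_0$ such that $F^{-1}(F(y))=y$ for $y\in V_0$ and $F(F^{-1}(x))=x$ for $x\in W$. Moreover, for every $x\in W$,
$$D(F^{-1})(x)=\bigl(DF(F^{-1}(x))\bigr)^{-1}.$$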
Remark.
Intuition: If the derivative $DF(y_0)$ is invertible, then to first order $F$ acts as an invertible linear map near $y_0$. The theorem says this local linear invertibility lifts to local nonlinear invertibility. The derivative formula is the infinitesimal version of the inverse: the derivative of the inverse is the inverse of the derivative.
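A familiar instance is the polar-coordinate map $F(r,\theta)=(r\cos\theta,r\sin\theta)$, whose derivative has determinant $r$ and is therefore invertible for $r\neq 0$. The sketch below uses this example, which is an assumption for illustration and not taken from the text, to compare the matrix $(DF(F^{-1}(x)))^{-1}$ with a finite-difference Jacobian of the explicit local inverse.

```python
import math

def F(r, theta):
    """Polar-to-Cartesian map; det DF = r, so DF is invertible for r != 0."""
    return (r * math.cos(theta), r * math.sin(theta))

def F_inv(x, y):
    """Local inverse of F, valid away from the origin and the branch cut of atan2."""
    return (math.hypot(x, y), math.atan2(y, x))

def DF(r, theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def numeric_jacobian(G, x, y, h=1e-6):
    """Central-difference Jacobian of G: R^2 -> R^2 at (x, y)."""
    gx_plus, gx_minus = G(x + h, y), G(x - h, y)
    gy_plus, gy_minus = G(x, y + h), G(x, y - h)
    return [
        [(gx_plus[i] - gx_minus[i]) / (2 * h), (gy_plus[i] - gy_minus[i]) / (2 * h)]
        for i in range(2)
    ]

r0, theta0 = 2.0, 0.7
x0, y0 = F(r0, theta0)
from_formula = inverse_2x2(DF(*F_inv(x0, y0)))      # (DF(F^{-1}(x)))^{-1}
from_differences = numeric_jacobian(F_inv, x0, y0)  # D(F^{-1})(x), numerically
print(from_formula)
print(from_differences)
```

The two matrices should agree entry by entry up to the finite-difference error. The inverse here is only local: $F$ itself is not injective, since $\theta$ and $\theta+2\pi$ give the same point.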
Remark.
An alternative construction (Lecture 27). The reduction "inverse function = implicit function with $f(x,y)=F(y)-x$" is often presented in the opposite direction: one starts with the target equation $F(y)=x$, defines an auxiliary function $\Phi:\mathbb{R}^m\times V\to\mathbb{R}^m$ by $\Phi(x,y)=F(y)-x$, notes that $\Phi$ is $C^1$ with $\Phi(x_0,y_0)=0$ and $\partial\Phi/\partial y\,(x_0,y_0)=DF(y_0)$ invertible, and then extracts from the implicit function theorem a $C^1$ map $h$ on a neighbourhood $W$ of $x_0$ with $\Phi(x,h(x))=0$, i.e. $F(h(x))=x$. So $h$ is a right inverse of $F$. The chain rule applied to $F\circ h=\mathrm{id}_W$ gives $DF(h(x))\circ Dh(x)=I$, so $Dh(x)$ is invertible; applying the same argument to $h$ (whose derivative is $DF(h(x))^{-1}$) produces a right inverse $g$ of $h$. Then $F=F\circ(h\circ g)=(F\circ h)\circ g=g$ locally, so $h\circ F=h\circ g=\mathrm{id}$ and $h$ is a two-sided inverse of $F$, i.e. $h=F^{-1}$. This mirrors exactly how one proves the one-variable version.
Corollary. Derivative of the Inverse via Chain Rule
Suppose $F:U\to\mathbb{R}^m$, with $U\subseteq\mathbb{R}^m$ open and $a\in U$, is of class $C^1$ and has a $C^1$ inverse $F^{-1}$ defined on a neighbourhood of $b=F(a)$. Then
$$D(F^{-1})(b)=(DF(a))^{-1}.$$
Remark.
Intuition: Once we know that $F^{-1}$ exists and is differentiable, the derivative formula comes for free from the chain rule. The hard content of the inverse function theorem is the existence of $F^{-1}$; the formula is an easy consequence.
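For reference, the chain-rule computation behind the corollary can be written out in a few lines. It uses only the hypotheses of the corollary: the identity $F^{-1}\circ F=\mathrm{id}$ near $a$ and the differentiability of both maps.

```latex
\begin{aligned}
F^{-1}(F(y)) &= y \quad \text{for } y \text{ near } a \\
\Longrightarrow\quad D(F^{-1})(F(a))\, DF(a) &= I \\
\Longrightarrow\quad D(F^{-1})(b) &= (DF(a))^{-1}, \qquad b = F(a).
\end{aligned}
```

The last step uses that both matrices are $m\times m$, so a one-sided inverse is automatically the two-sided inverse.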
Geometric Interpretation: Level Sets as Smooth Graphs
Let $f:U\to\mathbb{R}^m$ with $U\subseteq\mathbb{R}^{n+m}$ open, $f$ of class $C^1$. The level set $\{(x,y)\in U:f(x,y)=0\}$ is a subset of $\mathbb{R}^{n+m}$. The implicit function theorem says that, at points where the partial Jacobian with respect to the last $m$ variables is invertible, the level set is locally the graph of a $C^1$ function $g$ from an open subset of $\mathbb{R}^n$ to $\mathbb{R}^m$. A subset of $\mathbb{R}^{n+m}$ that is, near each of its points, the graph of a smooth function of $n$ of the coordinates is called a smooth $n$-dimensional submanifold of $\mathbb{R}^{n+m}$.
More generally, given $F:U\to\mathbb{R}^m$ with $U\subseteq\mathbb{R}^N$, if $DF(p)$ has full rank $m$ at some point $p$ with $F(p)=0$, then there is a choice of $m$ coordinates in $\mathbb{R}^N$ whose partials give an invertible $m\times m$ submatrix; after reindexing, the implicit function theorem applies and the level set $\{F=0\}$ is locally the graph of a $C^1$ function of the remaining $N-m=n$ coordinates. Maps $F$ whose derivative has rank $m$ at every point of $F^{-1}(0)$ are called submersions along $F^{-1}(0)$ (equivalently, $0$ is a regular value of $F$), and the theorem says:
The zero level set of a submersion is a smooth manifold of dimension $N-m$.
This regular level set statement is one of the standard ways of producing manifolds in differential geometry.
Worked Examples
Example. The unit circle, revisited
Let $f(x,y)=x^2+y^2-1$. Then $f_x=2x$ and $f_y=2y$. Consider the point $(x_0,y_0)=(\sqrt2/2,\sqrt2/2)$. We have $f(x_0,y_0)=1/2+1/2-1=0$ and $f_y(x_0,y_0)=\sqrt2\neq 0$. The implicit function theorem gives a $C^1$ function $g$ on a neighbourhood of $x_0=\sqrt2/2$ such that $g(x_0)=y_0$ and $f(x,g(x))=0$. The derivative formula yields
$$g'(x_0)=-\frac{f_x(x_0,y_0)}{f_y(x_0,y_0)}=-\frac{2x_0}{2y_0}=-1.$$
This is the slope of the tangent line to the unit circle at $(\sqrt2/2,\sqrt2/2)$, as expected.
At the point $(1,0)$ we have $f_y=0$, so the hypothesis fails and the theorem does not apply; indeed near $(1,0)$ the circle cannot be written as $y=g(x)$ (the two branches $y=\pm\sqrt{1-x^2}$ meet there). However, $f_x(1,0)=2\neq 0$, so swapping the roles of $x$ and $y$, the theorem gives $x=h(y)$ near $(1,0)$, with $h'(0)=-f_y/f_x=0$ (a vertical tangent).
Example. An implicit curve without a closed form
Let $f$ be given by
$$f(x,y)=\frac16\left(2\sqrt3\,\arctan\!\left(\frac{2y-1}{\sqrt3}\right)+\log\frac{(1+y)^2}{1-y+y^2}\right)-x-\frac{\sqrt3\,\pi+\log 64}{18}.$$
We take $(x_0,y_0)=(0,1)$. Direct substitution (using $\arctan(1/\sqrt3)=\pi/6$, $\log\frac{(1+1)^2}{1-1+1}=\log 4=2\log 2$, and $\log 64=6\log 2$) gives $f(0,1)=0$. The partials are
$$\frac{\partial f}{\partial x}=-1,\qquad\frac{\partial f}{\partial y}=\frac{1}{y^3+1}.$$
Take $D=\mathbb{R}\times(-1,\infty)$ so $y^3+1>0$; $f$ is $C^1$ on $D$. At $(0,1)$, $\partial f/\partial y=1/2\neq 0$. By the implicit function theorem, there is $r>0$ and a $C^1$ function $g:(-r,r)\to\mathbb{R}$ with $g(0)=1$ and $f(x,g(x))=0$. No closed form for $g$ is available, but the derivative formula yields
$$g'(x)=-\frac{-1}{1/(g(x)^3+1)}=g(x)^3+1.$$
In particular $g'(0)=g(0)^3+1=2$. We have discovered that $g$ is the solution to the initial-value problem
$$g'(x)=g(x)^3+1,\qquad g(0)=1.$$
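The link between the implicit equation and the initial-value problem can be tested numerically. The sketch below integrates $g'=g^3+1$, $g(0)=1$ with the classical fourth-order Runge-Kutta method and checks that the result satisfies $f(x,g(x))\approx 0$. The choice of integrator, the step count and the sample points are assumptions made for this illustration; the sample points stay close to $0$ because the solution of this equation does not exist for all $x$.

```python
import math

SQRT3 = math.sqrt(3.0)

def f(x, y):
    """The function of the example, defined for y > -1."""
    bracket = (2.0 * SQRT3 * math.atan((2.0 * y - 1.0) / SQRT3)
               + math.log((1.0 + y) ** 2 / (1.0 - y + y * y)))
    return bracket / 6.0 - x - (SQRT3 * math.pi + math.log(64.0)) / 18.0

def rhs(y):
    """Right-hand side of the initial-value problem g' = g**3 + 1."""
    return y ** 3 + 1.0

def integrate_g(x_end, steps=1000):
    """Classical RK4 from x = 0 (where g = 1) to x = x_end; x_end may be negative."""
    h = x_end / steps
    y = 1.0
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

for x in (-0.2, -0.1, 0.0, 0.1, 0.2):
    y = integrate_g(x)
    print(x, y, f(x, y))  # the last column should be close to zero
```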
Example. The sphere in $\mathbb{R}^3$
Let $f(x,y,z)=x^2+y^2+z^2-1$, whose zero set is the unit sphere. Here $n=2$ and $m=1$: we want to solve for $z$ in terms of $(x,y)$. At any $(x_0,y_0,z_0)$ on the sphere with $z_0\neq 0$, $\partial f/\partial z=2z_0\neq 0$, so the implicit function theorem gives $z=g(x,y)$ locally, with
$$\frac{\partial g}{\partial x}(x_0,y_0)=-\frac{f_x}{f_z}=-\frac{x_0}{z_0},\qquad\frac{\partial g}{\partial y}(x_0,y_0)=-\frac{y_0}{z_0}.$$
The upper hemisphere is globally $z=\sqrt{1-x^2-y^2}$. The hypothesis fails on the equator $\{z=0\}$, and indeed the sphere is not locally a graph $z=g(x,y)$ near equatorial points.
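Example. A system of two equations in three unknowns
Consider the system
$$\begin{cases}f_1(x,y_1,y_2)=x^2+y_1^2+y_2^2-3=0,\\ f_2(x,y_1,y_2)=x+y_1+y_2-3=0,\end{cases}$$
at the point $(x_0,y_1^0,y_2^0)=(1,1,1)$. Both equations are satisfied. Here $n=1$, $m=2$. The partial Jacobian in $(y_1,y_2)$ is
$$B=\begin{pmatrix}2y_1&2y_2\\1&1\end{pmatrix}\bigg|_{(1,1,1)}=\begin{pmatrix}2&2\\1&1\end{pmatrix},\qquad\det B=0.$$
So the hypothesis fails and the theorem does not apply with $(y_1,y_2)$ as the dependent variables. In fact no choice of variables works here: $\nabla f_1(1,1,1)=(2,2,2)$ and $\nabla f_2(1,1,1)=(1,1,1)$ are parallel, so the full Jacobian has rank 1. Geometrically, the plane is tangent to the sphere at this point (its distance from the origin is $3/\sqrt3=\sqrt3$, the radius of the sphere), so the two surfaces meet tangentially rather than transversely and the solution set is the single point $(1,1,1)$. When a plane cuts the sphere transversely, for instance $x+y_1+y_2=c$ with $|c|<3$, the intersection is a circle, and at every point of it where $y_1\neq y_2$ we have $\det B=2(y_1-y_2)\neq 0$, so we can locally solve for $(y_1,y_2)$ in terms of $x$.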
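The rank condition in this example can be checked mechanically by computing the $2\times 2$ minors of the Jacobian. In the sketch below, the second test point $(1,1,-1)$ lies on the same sphere and on the shifted plane $x+y_1+y_2=1$; that plane and point are hypothetical additions for contrast, not part of the example, and the helper names are illustrative.

```python
def jacobian(x, y1, y2):
    """Rows: gradients of f1 = x^2 + y1^2 + y2^2 - 3 and of f2 = x + y1 + y2 - c.

    The constant c does not affect the derivatives.
    """
    return [[2.0 * x, 2.0 * y1, 2.0 * y2],
            [1.0, 1.0, 1.0]]

def minor(J, i, j):
    """Determinant of the 2x2 submatrix of J formed by columns i and j."""
    return J[0][i] * J[1][j] - J[0][j] * J[1][i]

def report(point):
    J = jacobian(*point)
    return {
        "det_B": minor(J, 1, 2),  # partial Jacobian in (y1, y2)
        "minors": [minor(J, 0, 1), minor(J, 0, 2), minor(J, 1, 2)],
    }

tangent = report((1.0, 1.0, 1.0))      # the point of the example (c = 3)
transverse = report((1.0, 1.0, -1.0))  # hypothetical point on the plane with c = 1
print(tangent)
print(transverse)
```

At the tangent point every minor vanishes, so no pair of variables can serve as the dependent ones; at the transverse point $\det B\neq 0$ and the theorem applies.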
Applications
Parametrizing Curves and Surfaces
The implicit function theorem is the fundamental tool for turning implicit descriptions of geometric objects into local parametrizations:
Curves in $\mathbb{R}^2$ described by $f(x,y)=0$: locally a graph $y=g(x)$ or $x=h(y)$ wherever $\nabla f\neq 0$.
Surfaces in $\mathbb{R}^3$ described by $f(x,y,z)=0$: locally a graph $z=g(x,y)$ (or similar) wherever $\nabla f\neq 0$.
Curves in $\mathbb{R}^3$ described as the intersection of two surfaces $f_1=f_2=0$: locally parametrized by one of the coordinates wherever the $2\times 2$ minor of the Jacobian with rows $\nabla f_1,\nabla f_2$ with respect to the other two coordinates is nonzero.
Preparing for Lagrange Multipliers
The Lagrange multiplier method for constrained optimization (minimize $f(x)$ subject to $g(x)=0$) relies on the implicit function theorem. At a minimum $x^*$ of $f$ on the constraint set $\{g=0\}$, assuming $\nabla g(x^*)\neq 0$, the implicit function theorem lets us parametrize the constraint set locally as a smooth surface, which lets us check that directional derivatives of $f$ along the tangent space vanish. This forces $\nabla f(x^*)$ to be normal to the constraint surface, i.e. parallel to $\nabla g(x^*)$, yielding the multiplier equation $\nabla f=\lambda\nabla g$. A full treatment is the subject of the next chapter.
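A small numerical check makes the argument concrete. The sketch assumes a toy problem not taken from the text: minimize $x+y$ on the unit circle, whose minimizer is $(-\sqrt2/2,-\sqrt2/2)$. To avoid a clash with the implicit function $g$ of earlier sections, the constraint is called `c` in the code. The slope of the local parametrization comes from the implicit function theorem, and the derivative of the objective along the constraint should vanish at the minimizer.

```python
import math

def grad_f(x, y):
    """Gradient of the objective f(x, y) = x + y (hypothetical example)."""
    return (1.0, 1.0)

def grad_c(x, y):
    """Gradient of the constraint c(x, y) = x^2 + y^2 - 1."""
    return (2.0 * x, 2.0 * y)

def derivative_along_constraint(x, y):
    """d/dx of f(x, y(x)), where y'(x) = -c_x / c_y by the implicit function theorem."""
    fx, fy = grad_f(x, y)
    cx, cy = grad_c(x, y)
    return fx + fy * (-cx / cy)

x_star = y_star = -math.sqrt(2.0) / 2.0  # minimizer of x + y on the unit circle
slope = derivative_along_constraint(x_star, y_star)
lam = grad_f(x_star, y_star)[0] / grad_c(x_star, y_star)[0]
print(slope, lam)
```

The vanishing slope is equivalent to $\nabla f=\lambda\nabla c$ at the minimizer, here with $\lambda=-1/\sqrt2$.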
The implicit function theorem is one of the most consequential theorems in analysis: it bridges algebra (solving equations) and geometry (level sets as manifolds), it underpins the inverse function theorem and Lagrange multipliers, it is the starting point of differential topology and differential geometry, and its proof via Banach's fixed-point theorem illustrates the deep power of contraction mappings — a strategy that recurs throughout analysis, from solving differential equations (Picard iteration) to the spectral theory of operators.