
Partial Derivatives

1.1 Definition

Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$. The partial derivative of $f$ with respect to $x_i$ at $\mathbf{a} = (a_1, \ldots, a_n)$ is

$$\frac{\partial f}{\partial x_i}(\mathbf{a}) = \lim_{h \to 0} \frac{f(a_1, \ldots, a_i + h, \ldots, a_n) - f(a_1, \ldots, a_n)}{h},$$

provided the limit exists. This is the rate of change of $f$ in the direction of the $x_i$-axis, holding all other variables fixed.
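As a rough numerical illustration of the definition, the sketch below estimates a partial derivative by perturbing a single coordinate. It uses a central difference rather than the one-sided quotient in the definition (a substitution made here for accuracy), and the helper `partial_derivative` and the test function `example` are hypothetical names introduced only for this example.

```python
from typing import Callable, Sequence


def partial_derivative(f: Callable[[Sequence[float]], float],
                       a: Sequence[float], i: int, h: float = 1e-6) -> float:
    """Central-difference estimate of the i-th partial derivative of f at a."""
    forward = list(a)
    backward = list(a)
    forward[i] += h    # perturb only the i-th coordinate
    backward[i] -= h   # all other coordinates stay fixed
    return (f(forward) - f(backward)) / (2.0 * h)


def example(p: Sequence[float]) -> float:
    """Hypothetical test function f(x, y) = x^2 * y."""
    x, y = p
    return x**2 * y


fx = partial_derivative(example, [1.0, 2.0], 0)  # analytic value: 2xy = 4
fy = partial_derivative(example, [1.0, 2.0], 1)  # analytic value: x^2 = 1
```

The step size `h` is a trade-off: too large and the truncation error dominates, too small and floating-point cancellation does.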

Notation. Common notations for the partial derivative with respect to $x_i$ include $f_{x_i}$, $\partial_i f$, and $\frac{\partial f}{\partial x_i}$. We use these interchangeably.

1.2 Clairaut’s Theorem

Theorem 1.1 (Clairaut’s Theorem / Schwarz’s Theorem). If $f_{xy}$ and $f_{yx}$ are continuous on an open set containing $(a, b)$, then

$$\frac{\partial^2 f}{\partial x \partial y}(a,b) = \frac{\partial^2 f}{\partial y \partial x}(a,b).$$

Proof. Define the second-order difference function

$$\Delta(h, k) = f(a+h,\, b+k) - f(a+h,\, b) - f(a,\, b+k) + f(a, b)$$

for sufficiently small $h, k \neq 0$, so that all four points lie in the open set. Define $\phi(s) = f(s, b+k) - f(s, b)$. Then $\Delta(h,k) = \phi(a+h) - \phi(a)$. By the Mean Value Theorem, there exists $\theta_1 \in (0, 1)$ such that

$$\Delta(h, k) = h \cdot \phi'(a + \theta_1 h) = h \left[f_x(a + \theta_1 h,\, b+k) - f_x(a + \theta_1 h,\, b)\right].$$

Apply the Mean Value Theorem again to the function $g(t) = f_x(a + \theta_1 h,\, t)$ on the interval between $b$ and $b+k$. There exists $\theta_2 \in (0, 1)$ such that

$$\Delta(h, k) = hk \cdot f_{xy}(a + \theta_1 h,\, b + \theta_2 k).$$

Similarly, by reversing the order of application (first in $y$, then in $x$), there exist $\theta_3, \theta_4 \in (0,1)$ such that

$$\Delta(h, k) = hk \cdot f_{yx}(a + \theta_3 h,\, b + \theta_4 k).$$

Equating the two expressions for $\Delta(h, k)$ and dividing by $hk \neq 0$, we have

$$f_{xy}(a + \theta_1 h,\, b + \theta_2 k) = f_{yx}(a + \theta_3 h,\, b + \theta_4 k).$$

Taking the limit as $(h, k) \to (0, 0)$ and using continuity of $f_{xy}$ and $f_{yx}$, we obtain $f_{xy}(a, b) = f_{yx}(a, b)$. $\blacksquare$

Intuition. Clairaut’s theorem tells us that, under a mild regularity condition (continuity of the mixed second partials), the order in which we differentiate does not matter. Without this condition, the mixed partials may differ.
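The second-order difference $\Delta(h, k)$ from the proof can also be used numerically: for a smooth function, $\Delta(h, k)/(hk)$ should be close to the mixed partial at $(a, b)$. The sketch below checks this for a hypothetical test function $f(x, y) = e^x \sin y$, whose mixed partial is $e^x \cos y$; the function, the point, and the step sizes are all assumptions chosen for illustration.

```python
import math


def second_difference(f, a, b, h, k):
    """Delta(h, k) = f(a+h, b+k) - f(a+h, b) - f(a, b+k) + f(a, b)."""
    return f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b)


def f(x, y):
    """Hypothetical smooth test function."""
    return math.exp(x) * math.sin(y)


a, b = 0.5, 1.0
h = k = 1e-4

# Delta / (hk) equals a mixed partial evaluated at a nearby point,
# so it approximates f_xy(a, b) = f_yx(a, b) for small h and k.
estimate = second_difference(f, a, b, h, k) / (h * k)
exact = math.exp(a) * math.cos(b)  # analytic mixed partial e^x cos y
```

Because $\Delta$ is symmetric in the roles of $x$ and $y$, the same quotient approximates both $f_{xy}$ and $f_{yx}$, which is the heart of the proof.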

1.3 Differentiability

Definition. A function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ is differentiable at $\mathbf{a}$ if there exists a linear map $L : \mathbb{R}^n \to \mathbb{R}$ such that

$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - L(\mathbf{h})}{\lVert \mathbf{h} \rVert} = 0.$$

When $f$ is differentiable at $\mathbf{a}$, the linear map $L$ is given by the gradient: $L(\mathbf{h}) = \nabla f(\mathbf{a}) \cdot \mathbf{h}$.

Remark. Existence of all partial derivatives at a point does not imply differentiability at that point. The canonical counterexample is

$$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x,y) \neq (0,0), \\ 0 & \text{if } (x,y) = (0,0). \end{cases}$$

Both $f_x(0,0)$ and $f_y(0,0)$ exist (and equal $0$), because $f$ vanishes identically on both axes. Yet $f(t, t) = \tfrac{1}{2}$ for every $t \neq 0$, so $f$ is not even continuous at the origin, hence not differentiable.
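A quick numerical look at this counterexample, as a minimal sketch: the difference quotients along the axes are zero, while the values along the diagonal $y = x$ stay at $\tfrac{1}{2}$ no matter how close to the origin we sample. The sample points are arbitrary choices.

```python
def f(x: float, y: float) -> float:
    """The counterexample: xy / (x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x**2 + y**2)


h = 1e-8

# Difference quotients along the coordinate axes: f is identically 0 there.
fx0 = (f(h, 0.0) - f(0.0, 0.0)) / h
fy0 = (f(0.0, h) - f(0.0, 0.0)) / h

# Values along the diagonal y = x, approaching the origin.
diagonal_values = [f(t, t) for t in (1e-1, 1e-4, 1e-8)]
```

The diagonal values do not approach $f(0, 0) = 0$, which is the failure of continuity described in the remark.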

1.4 The Gradient

The gradient of $f$ at $\mathbf{a}$ is

$$\nabla f(\mathbf{a}) = \left(\frac{\partial f}{\partial x_1}(\mathbf{a}), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{a})\right).$$

The linear approximation of $f$ near $\mathbf{a}$ is

$$f(\mathbf{a} + \mathbf{h}) \approx f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot \mathbf{h}.$$

Theorem 1.2. If all partial derivatives of $f$ exist and are continuous in a neighbourhood of $\mathbf{a}$, then $f$ is differentiable at $\mathbf{a}$.

Remark. Functions whose partial derivatives exist and are continuous on an open set $U$ are said to be of class $C^1(U)$. Theorem 1.2 says $C^1 \implies$ differentiable. The converse is false: there exist differentiable functions whose partial derivatives are not continuous.

Proposition. If $f$ is differentiable at $\mathbf{a}$, then $f$ is continuous at $\mathbf{a}$.

Proof. From the definition of differentiability:

$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = L(\mathbf{h}) + \varepsilon(\mathbf{h})\lVert \mathbf{h} \rVert,$$

where $L$ is linear and $\varepsilon(\mathbf{h}) \to 0$ as $\mathbf{h} \to \mathbf{0}$. As $\mathbf{h} \to \mathbf{0}$, both terms on the right vanish, so $f(\mathbf{a} + \mathbf{h}) \to f(\mathbf{a})$. $\blacksquare$

1.5 Directional Derivatives

The directional derivative of $f$ at $\mathbf{a}$ in the direction of a unit vector $\mathbf{u}$ is

$$D_{\mathbf{u}} f(\mathbf{a}) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\mathbf{u}) - f(\mathbf{a})}{h}.$$

Theorem 1.3. If $f$ is differentiable at $\mathbf{a}$, then

$$D_{\mathbf{u}} f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{u}.$$

Proof. Since $f$ is differentiable at $\mathbf{a}$,

$$\frac{f(\mathbf{a} + h\mathbf{u}) - f(\mathbf{a})}{h} = \frac{\nabla f(\mathbf{a}) \cdot (h\mathbf{u}) + \varepsilon(h\mathbf{u}) \lVert h\mathbf{u} \rVert}{h}$$

$$= \nabla f(\mathbf{a}) \cdot \mathbf{u} + \varepsilon(h\mathbf{u})\, \frac{\lvert h \rvert}{h}\, \lVert \mathbf{u} \rVert,$$

where $\varepsilon(\mathbf{h}) \to 0$ as $\mathbf{h} \to \mathbf{0}$. Since $\lvert h \rvert / h = \pm 1$ is bounded, the second term vanishes as $h \to 0$, which gives the result. $\blacksquare$

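Corollary 1.4. If $f$ is differentiable at $\mathbf{a}$ and $\nabla f(\mathbf{a}) \neq \mathbf{0}$, the gradient points in the direction of steepest ascent, and $\lVert \nabla f \rVert$ is the rate of steepest ascent.

Proof. By the Cauchy-Schwarz inequality, $\lvert \nabla f \cdot \mathbf{u} \rvert \leq \lVert \nabla f \rVert \cdot \lVert \mathbf{u} \rVert = \lVert \nabla f \rVert$, with equality when $\mathbf{u}$ is parallel to $\nabla f$. The maximum value $D_{\mathbf{u}} f = \lVert \nabla f \rVert$ is attained at $\mathbf{u} = \nabla f / \lVert \nabla f \rVert$. $\blacksquare$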
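To see the corollary in action, the following sketch sweeps unit vectors $\mathbf{u} = (\cos\theta, \sin\theta)$ around the circle and records where $\nabla f \cdot \mathbf{u}$ is largest. The gradient value $(3, 4)$ is a hypothetical choice (so that its norm is $5$), and the brute-force sweep is only an illustration, not a recommended way to find the steepest direction.

```python
import math

grad = (3.0, 4.0)  # hypothetical gradient vector with norm 5


def directional_derivative(theta: float) -> float:
    """grad . u for the unit vector u = (cos(theta), sin(theta))."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)


n = 3600  # number of sampled directions
angles = [2.0 * math.pi * i / n for i in range(n)]

best_theta = max(angles, key=directional_derivative)  # direction of largest rate
best_rate = directional_derivative(best_theta)

grad_norm = math.hypot(*grad)              # predicted maximal rate
grad_angle = math.atan2(grad[1], grad[0])  # predicted maximising direction
```

Up to the sampling resolution, `best_theta` matches the angle of the gradient and `best_rate` matches its norm.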

1.6 Chain Rule

Theorem 1.5 (Multivariable Chain Rule). If $\mathbf{g} : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at $\mathbf{a}$ and $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at $\mathbf{g}(\mathbf{a})$, then

$$\nabla (f \circ \mathbf{g})(\mathbf{a}) = J\mathbf{g}(\mathbf{a})^T \nabla f(\mathbf{g}(\mathbf{a})),$$

where $J\mathbf{g}$ is the Jacobian matrix of $\mathbf{g}$.

Proof. Write $h(t) = f(\mathbf{g}(\mathbf{a} + t\mathbf{v}))$ for a fixed direction $\mathbf{v}$. Then

$$\frac{h(t) - h(0)}{t} = \frac{f(\mathbf{g}(\mathbf{a} + t\mathbf{v})) - f(\mathbf{g}(\mathbf{a}))}{t}.$$

Let $\mathbf{k} = \mathbf{g}(\mathbf{a} + t\mathbf{v}) - \mathbf{g}(\mathbf{a})$. By differentiability of $\mathbf{g}$, $\mathbf{k} = J\mathbf{g}(\mathbf{a})(t\mathbf{v}) + o(t)$, and $\mathbf{k} \to \mathbf{0}$ as $t \to 0$. By differentiability of $f$:

$$f(\mathbf{g}(\mathbf{a}) + \mathbf{k}) - f(\mathbf{g}(\mathbf{a})) = \nabla f(\mathbf{g}(\mathbf{a})) \cdot \mathbf{k} + o(\lVert \mathbf{k} \rVert)$$

$$= \nabla f(\mathbf{g}(\mathbf{a})) \cdot [J\mathbf{g}(\mathbf{a})(t\mathbf{v}) + o(t)] + o(t).$$

Dividing by $t$ and taking $t \to 0$:

$$h'(0) = \nabla f(\mathbf{g}(\mathbf{a})) \cdot J\mathbf{g}(\mathbf{a})\mathbf{v} = [J\mathbf{g}(\mathbf{a})^T \nabla f(\mathbf{g}(\mathbf{a}))] \cdot \mathbf{v}.$$

Since $\mathbf{v}$ was arbitrary (taking $\mathbf{v} = \mathbf{e}_i$ reads off the $i$-th component), $\nabla (f \circ \mathbf{g})(\mathbf{a}) = J\mathbf{g}(\mathbf{a})^T \nabla f(\mathbf{g}(\mathbf{a}))$. $\blacksquare$

1.7 Chain Rule Worked Example

Problem. Let $f(x, y) = x^2 y$ and let $x = \cos t$, $y = \sin t$. Find $\frac{d}{dt} f(\cos t, \sin t)$ using the chain rule, and verify by direct substitution.

Solution.

Via the chain rule:

$$\frac{d}{dt} f(x(t), y(t)) = f_x \cdot x'(t) + f_y \cdot y'(t)$$

$$= 2xy \cdot (-\sin t) + x^2 \cdot \cos t = -2\cos t \sin^2 t + \cos^3 t.$$

Via direct substitution: $f(\cos t, \sin t) = \cos^2 t \sin t$.

$$\frac{d}{dt}[\cos^2 t \sin t] = -2\cos t \sin^2 t + \cos^3 t.$$

Both methods agree. $\blacksquare$
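The same comparison can be made numerically. The sketch below evaluates the chain-rule expression $f_x\,x'(t) + f_y\,y'(t)$ and compares it with a central-difference derivative of the composite $t \mapsto f(\cos t, \sin t)$; the sample point $t = 0.7$ and the step size are arbitrary assumptions.

```python
import math


def f(x: float, y: float) -> float:
    return x**2 * y


def chain_rule_derivative(t: float) -> float:
    """f_x * x'(t) + f_y * y'(t) with x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    f_x, f_y = 2.0 * x * y, x**2
    return f_x * (-math.sin(t)) + f_y * math.cos(t)


def numeric_derivative(t: float, h: float = 1e-6) -> float:
    """Central difference of the composite t -> f(cos t, sin t)."""
    plus = f(math.cos(t + h), math.sin(t + h))
    minus = f(math.cos(t - h), math.sin(t - h))
    return (plus - minus) / (2.0 * h)


t0 = 0.7  # arbitrary sample point
via_chain_rule = chain_rule_derivative(t0)
via_difference = numeric_derivative(t0)
closed_form = -2.0 * math.cos(t0) * math.sin(t0)**2 + math.cos(t0)**3
```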

1.8 Worked Example

Problem. Let $f(x, y) = x^2 y + \sin(xy)$. Compute $\nabla f$ and find the directional derivative at $(1, \pi)$ in the direction $\mathbf{u} = (1/\sqrt{2}, 1/\sqrt{2})$.

Solution.

$$\frac{\partial f}{\partial x} = 2xy + y\cos(xy)$$

$$\frac{\partial f}{\partial y} = x^2 + x\cos(xy)$$

$$\nabla f(1, \pi) = (2\pi + \pi\cos(\pi),\, 1 + \cos(\pi)) = (2\pi - \pi,\, 1 - 1) = (\pi, 0)$$

$$D_{\mathbf{u}} f(1, \pi) = \nabla f(1, \pi) \cdot \mathbf{u} = \pi \cdot \frac{1}{\sqrt{2}} + 0 = \frac{\pi}{\sqrt{2}}$$

$\blacksquare$

1.9 Additional Worked Examples

Problem. Let $f(x, y, z) = x^2 y\, e^z + \sin(xz)$. Compute $\nabla f$ and evaluate it at $(1, 0, \pi)$.

Solution.

$$\frac{\partial f}{\partial x} = 2xy\, e^z + z\cos(xz)$$

$$\frac{\partial f}{\partial y} = x^2 e^z$$

$$\frac{\partial f}{\partial z} = x^2 y\, e^z + x\cos(xz)$$

At $(1, 0, \pi)$:

$$f_x(1,0,\pi) = 0 + \pi\cos(\pi) = -\pi, \quad f_y(1,0,\pi) = e^{\pi}, \quad f_z(1,0,\pi) = 0 + \cos(\pi) = -1$$

$$\nabla f(1, 0, \pi) = (-\pi,\, e^{\pi},\, -1)$$

$\blacksquare$

Problem. Find the directional derivative of $f(x,y) = x^2 y^3$ at $(1, -1)$ in the direction of $\mathbf{v} = (3, -4)$.

Solution.

First normalise $\mathbf{v}$: $\lVert \mathbf{v} \rVert = \sqrt{9 + 16} = 5$, so $\mathbf{u} = (3/5,\, -4/5)$.

$$\nabla f = (2xy^3,\, 3x^2 y^2)$$

$$\nabla f(1, -1) = (2 \cdot 1 \cdot (-1),\, 3 \cdot 1 \cdot 1) = (-2, 3)$$

$$D_{\mathbf{u}} f(1, -1) = (-2)(3/5) + (3)(-4/5) = \frac{-6 - 12}{5} = -\frac{18}{5}$$

$\blacksquare$
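Directional-derivative answers like the ones above are easy to sanity-check with a difference quotient along the normalised direction. The sketch below does this for $f(x, y) = x^2 y^3$ at $(1, -1)$ and for $f(x, y) = x^2 y + \sin(xy)$ at $(1, \pi)$; the helper name `directional_derivative` and the step size are assumptions made for this illustration.

```python
import math


def directional_derivative(f, point, direction, h=1e-6):
    """Central-difference estimate of D_u f at `point`.

    `direction` need not be a unit vector; it is normalised first.
    """
    norm = math.hypot(*direction)
    u = [d / norm for d in direction]
    plus = [p + h * ui for p, ui in zip(point, u)]
    minus = [p - h * ui for p, ui in zip(point, u)]
    return (f(*plus) - f(*minus)) / (2.0 * h)


# f(x, y) = x^2 y^3 at (1, -1) along v = (3, -4); analytic answer: -18/5
d_poly = directional_derivative(lambda x, y: x**2 * y**3, (1.0, -1.0), (3.0, -4.0))

# f(x, y) = x^2 y + sin(xy) at (1, pi) along (1, 1); analytic answer: pi / sqrt(2)
d_trig = directional_derivative(
    lambda x, y: x**2 * y + math.sin(x * y), (1.0, math.pi), (1.0, 1.0)
)
```

Normalising inside the helper guards against the common mistake of dotting the gradient with a non-unit vector.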

1.10 Implicit Differentiation

Suppose $F(x, y, z) = 0$ defines $z$ implicitly as a function of $x$ and $y$ near a point $(a, b, c)$ with $F_z(a, b, c) \neq 0$. By the Implicit Function Theorem, there exists a $C^1$ function $\varphi$ defined on a neighbourhood of $(a, b)$ such that $\varphi(a, b) = c$ and $F(x, y, \varphi(x, y)) = 0$.

Differentiating $F(x, y, \varphi(x, y)) = 0$ with respect to $x$, and writing $z = \varphi(x, y)$:

$$F_x + F_z \cdot \frac{\partial z}{\partial x} = 0 \implies \frac{\partial z}{\partial x} = -\frac{F_x}{F_z}.$$

Similarly, $\frac{\partial z}{\partial y} = -\frac{F_y}{F_z}$.

Proposition 1.6 (Implicit Function Theorem, special case). If $F : \mathbb{R}^3 \to \mathbb{R}$ is $C^1$ and $F(a,b,c) = 0$ with $F_z(a,b,c) \neq 0$, then there exist neighbourhoods $U$ of $(a,b)$ and $V$ of $c$ and a unique $C^1$ function $\varphi : U \to V$ with $\varphi(a,b) = c$ and $F(x, y, \varphi(x,y)) = 0$ for all $(x,y) \in U$.

Problem. If $x^2 y + y^2 z + z^2 x = 3$, find $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ at the point $(1, 1, 1)$.

Solution.

Let $F(x,y,z) = x^2 y + y^2 z + z^2 x - 3$. Then $F_x = 2xy + z^2$, $F_y = x^2 + 2yz$, $F_z = y^2 + 2zx$.

At $(1,1,1)$: $F_x = 3$, $F_y = 3$, $F_z = 3$.

$$\frac{\partial z}{\partial x} = -\frac{F_x}{F_z} = -\frac{3}{3} = -1, \quad \frac{\partial z}{\partial y} = -\frac{F_y}{F_z} = -\frac{3}{3} = -1$$

$\blacksquare$
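One way to check this result independently is to solve $F(x, y, z) = 0$ for $z$ numerically at nearby $(x, y)$ and difference the solutions. The sketch below does so with Newton's method in $z$, started at the known root $z = 1$; the solver, its fixed iteration count, and the step size are assumptions made for this illustration rather than part of the method above.

```python
def F(x: float, y: float, z: float) -> float:
    return x**2 * y + y**2 * z + z**2 * x - 3.0


def F_z(x: float, y: float, z: float) -> float:
    return y**2 + 2.0 * z * x


def solve_z(x: float, y: float, z0: float = 1.0, iterations: int = 20) -> float:
    """Newton's method in z for F(x, y, z) = 0, started near the root z = 1."""
    z = z0
    for _ in range(iterations):
        z -= F(x, y, z) / F_z(x, y, z)
    return z


h = 1e-5

# Central differences of the implicitly defined z = phi(x, y) at (1, 1).
dz_dx = (solve_z(1.0 + h, 1.0) - solve_z(1.0 - h, 1.0)) / (2.0 * h)
dz_dy = (solve_z(1.0, 1.0 + h) - solve_z(1.0, 1.0 - h)) / (2.0 * h)
```

Both estimates should land close to $-1$, in agreement with $-F_x/F_z$ and $-F_y/F_z$.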

1.11 Taylor’s Theorem for Multivariable Functions

Theorem 1.7 (Taylor’s Theorem). Let $f : U \subseteq \mathbb{R}^n \to \mathbb{R}$ be of class $C^{k+1}$ on an open convex set $U$, and let $\mathbf{a} \in U$. Then for all $\mathbf{x} \in U$:

$$f(\mathbf{x}) = f(\mathbf{a}) + \nabla f(\mathbf{a}) \cdot (\mathbf{x} - \mathbf{a}) + \frac{1}{2!}(\mathbf{x} - \mathbf{a})^T H_f(\mathbf{a})(\mathbf{x} - \mathbf{a}) + \cdots + R_k,$$

where $H_f$ is the Hessian matrix and the remainder $R_k$ can be written in Lagrange form:

$$R_k = \frac{1}{(k+1)!} \sum_{\lvert \alpha \rvert = k+1} \frac{(k+1)!}{\alpha!} D^{\alpha} f(\mathbf{c})\, (\mathbf{x} - \mathbf{a})^{\alpha}$$

for some $\mathbf{c}$ on the line segment joining $\mathbf{a}$ and $\mathbf{x}$.

For $n = 2$ and expansion order $2$, writing the increments as $h$ and $k$ (so $k$ here is an increment, not the order), the second-order Taylor expansion is:

$$f(a+h, b+k) = f(a,b) + f_x h + f_y k + \frac{1}{2}\left(f_{xx} h^2 + 2f_{xy} hk + f_{yy} k^2\right) + R_2,$$

where all partial derivatives are evaluated at $(a, b)$ and the remainder is

$$R_2 = \frac{1}{6}\left(f_{xxx} h^3 + 3f_{xxy} h^2 k + 3f_{xyy} hk^2 + f_{yyy} k^3\right)\Big|_{\mathbf{c}}$$

for some $\mathbf{c}$ on the line segment joining $(a, b)$ and $(a+h, b+k)$.

Proof (sketch). Define $\phi(t) = f(\mathbf{a} + t(\mathbf{x} - \mathbf{a}))$ for $t \in [0, 1]$. Apply the single-variable Taylor theorem to $\phi$ at $t = 0$:

$$\phi(1) = \phi(0) + \phi'(0) + \frac{1}{2!}\phi''(0) + \cdots + \frac{1}{k!}\phi^{(k)}(0) + \frac{1}{(k+1)!}\phi^{(k+1)}(\tau)$$

for some $\tau \in (0, 1)$. By the multivariable chain rule, $\phi'(t) = \nabla f(\mathbf{a} + t(\mathbf{x}-\mathbf{a})) \cdot (\mathbf{x}-\mathbf{a})$, and higher derivatives involve higher-order partial derivatives of $f$. Substituting $\mathbf{c} = \mathbf{a} + \tau(\mathbf{x}-\mathbf{a})$ yields the result. $\blacksquare$
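A cubic remainder means that halving the step should cut the error of the second-order expansion by roughly a factor of $8$. The sketch below checks this for a hypothetical test function $f(x, y) = e^x \sin y$ expanded about the origin, where $f = 0$, $f_x = 0$, $f_y = 1$, $f_{xx} = 0$, $f_{xy} = 1$, $f_{yy} = 0$, so the second-order polynomial is $k + hk$. The function and step sizes are assumptions chosen for illustration.

```python
import math


def f(x: float, y: float) -> float:
    """Hypothetical smooth test function."""
    return math.exp(x) * math.sin(y)


def taylor2(h: float, k: float) -> float:
    """Second-order Taylor polynomial of f about (0, 0).

    At the origin: f = 0, f_x = 0, f_y = 1, f_xx = 0, f_xy = 1, f_yy = 0,
    so the expansion reduces to k + h*k.
    """
    return k + h * k


def error(s: float) -> float:
    """Absolute error of the expansion at the point (s, s)."""
    return abs(f(s, s) - taylor2(s, s))


# Halving the step should shrink a third-order remainder by about 2^3 = 8.
ratio = error(0.1) / error(0.05)
```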

1.12 Common Pitfalls

:::caution Common Pitfalls

  • Existence of partials $\neq$ continuity of $f$. A function can have all partial derivatives at a point yet fail to be continuous (hence not differentiable) there.
  • Existence of partials $\neq$ differentiability. Even if all partials exist at a point, the function need not be differentiable. Continuity of the partials in a neighbourhood (i.e., $C^1$) is sufficient but not necessary.
  • Clairaut’s theorem requires continuity. Without continuity of the mixed partials, the equality $f_{xy} = f_{yx}$ can fail.
  • Normalise the direction vector. The formula $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$ assumes $\lVert \mathbf{u} \rVert = 1$. If the direction is given by a non-unit vector $\mathbf{v}$, divide by $\lVert \mathbf{v} \rVert$ first.

:::