Optimization

4.1 Local Extrema

Theorem 4.1 (First Derivative Test). If $f$ has a local extremum at an interior point $\mathbf{a}$ and $\nabla f(\mathbf{a})$ exists, then $\nabla f(\mathbf{a}) = \mathbf{0}$.

Points where $\nabla f = \mathbf{0}$ are called critical points (or stationary points).

Remark. Not all critical points are extrema. A critical point can be a local minimum, a local maximum, or a saddle point. The second derivative test (Section 4.2) distinguishes these cases.
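The remark can be illustrated numerically. The sketch below is only an illustration: the function `saddle_example` ($x^2 - y^2$), the helper `numerical_gradient`, and the step size `h` are assumptions introduced here, not part of the text. It checks that the origin is a critical point and yet is neither a minimum nor a maximum.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at `point` with central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return grad


def saddle_example(x, y):
    # Hypothetical illustration (not from the text): f(x, y) = x^2 - y^2.
    return x**2 - y**2


grad_at_origin = numerical_gradient(saddle_example, (0.0, 0.0))
is_critical = all(abs(component) < 1e-8 for component in grad_at_origin)

# The origin is critical, yet f rises along the x-axis and falls along the y-axis.
rises_along_x = saddle_example(0.1, 0.0) > saddle_example(0.0, 0.0)
falls_along_y = saddle_example(0.0, 0.1) < saddle_example(0.0, 0.0)
print(is_critical, rises_along_x, falls_along_y)
```

A vanishing gradient is therefore only a necessary condition; the behaviour in different directions decides the type of the point.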

4.2 Second Derivative Test

Theorem 4.2 (Second Derivative Test). Let $f$ have continuous second partial derivatives near a critical point $(a,b)$ with $f_x(a,b) = f_y(a,b) = 0$. Let

$$D = f_{xx}(a,b)\, f_{yy}(a,b) - [f_{xy}(a,b)]^2$$

be the Hessian determinant. Then:

  • If $D \gt 0$ and $f_{xx}(a,b) \gt 0$: local minimum.
  • If $D \gt 0$ and $f_{xx}(a,b) \lt 0$: local maximum.
  • If $D \lt 0$: saddle point.
  • If $D = 0$: the test is inconclusive.
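The four cases of the theorem translate directly into a small decision function. This is a minimal sketch; the name `classify_critical_point` and the sample Hessian entries are made up for illustration.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Second derivative test from the Hessian entries at a critical point."""
    d = fxx * fyy - fxy**2  # Hessian determinant D
    if d > 0:
        # D > 0 forces fxx != 0, so its sign decides minimum vs. maximum.
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"


# Illustrative (made-up) Hessian entries, one per branch of the theorem.
print(classify_critical_point(2, 3, 1))    # D = 5, fxx > 0
print(classify_critical_point(-2, -3, 1))  # D = 5, fxx < 0
print(classify_critical_point(1, -1, 0))   # D = -1
print(classify_critical_point(1, 1, 1))    # D = 0
```

The function expects the second partials already evaluated at the critical point; it does not check that the gradient vanishes there.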

Proof. By Taylor's theorem to second order (the first-order terms vanish because $(a,b)$ is a critical point), for small $h, k$:

$$f(a+h, b+k) - f(a,b) = \frac{1}{2}\left[f_{xx} h^2 + 2f_{xy} hk + f_{yy} k^2\right] + R_2$$

where the remainder $R_2 = o(h^2 + k^2)$ and all partials are evaluated at $(a,b)$. The sign of the right-hand side is determined by the quadratic form

$$Q(h,k) = f_{xx} h^2 + 2f_{xy} hk + f_{yy} k^2 = \begin{pmatrix} h & k \end{pmatrix} H \begin{pmatrix} h \\ k \end{pmatrix}$$

where $H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix}$ is the Hessian matrix.

By Sylvester's criterion for $2 \times 2$ symmetric matrices:

  • If $\det(H) = D \gt 0$ and $f_{xx} \gt 0$, then $H$ is positive definite, so $Q \gt 0$ for all $(h,k) \neq (0,0)$, giving a local minimum.
  • If $\det(H) = D \gt 0$ and $f_{xx} \lt 0$, then $H$ is negative definite, so $Q \lt 0$ for all $(h,k) \neq (0,0)$, giving a local maximum.
  • If $\det(H) = D \lt 0$, then $H$ is indefinite, so $Q$ takes both positive and negative values, giving a saddle point.

When $D = 0$, the quadratic form is degenerate and the sign is determined by higher-order terms. $\blacksquare$
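One way to see why the signs of $D$ and $f_{xx}$ capture definiteness is to look at the eigenvalues of $H$, whose product is $D$. The sketch below uses the closed form for a symmetric $2 \times 2$ matrix; the helper name `hessian_eigenvalues` and the sample entries are assumptions chosen for illustration.

```python
import math


def hessian_eigenvalues(fxx, fyy, fxy):
    """Eigenvalues (smaller, larger) of [[fxx, fxy], [fxy, fyy]]."""
    mean = (fxx + fyy) / 2
    # Symmetry guarantees a non-negative value under the square root.
    spread = math.sqrt(((fxx - fyy) / 2) ** 2 + fxy**2)
    return mean - spread, mean + spread


# Illustrative entries: one positive definite and one indefinite Hessian.
for fxx, fyy, fxy in [(12, 12, -4), (0, 0, -4)]:
    low, high = hessian_eigenvalues(fxx, fyy, fxy)
    d = fxx * fyy - fxy**2
    # The product of the eigenvalues equals the determinant D.
    print(f"eigenvalues=({low}, {high}), product={low * high}, D={d}")
```

Two eigenvalues of the same sign give $D \gt 0$ (definite form), while eigenvalues of opposite sign give $D \lt 0$ (indefinite form).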

4.3 Lagrange Multipliers

Theorem 4.3 (Method of Lagrange Multipliers). To find the extrema of $f(x,y,z)$ subject to the constraint $g(x,y,z) = 0$, solve the system:

$$\nabla f = \lambda \nabla g, \quad g = 0$$

More generally, for $k$ constraints $g_1 = 0, \ldots, g_k = 0$:

$$\nabla f = \lambda_1 \nabla g_1 + \cdots + \lambda_k \nabla g_k$$

Proof (single constraint, geometric justification). Let $M = \{(x,y,z) : g(x,y,z) = 0\}$ be the constraint surface, and assume $\nabla g(\mathbf{p}) \neq \mathbf{0}$ so that $M$ is a smooth surface near $\mathbf{p}$. If $f$ has a local extremum on $M$ at $\mathbf{p}$, then the directional derivative $D_{\mathbf{v}} f(\mathbf{p}) = 0$ for every tangent vector $\mathbf{v}$ to $M$ at $\mathbf{p}$. Since $\nabla f(\mathbf{p}) \cdot \mathbf{v} = 0$ for all such $\mathbf{v}$, the gradient $\nabla f(\mathbf{p})$ must be orthogonal to the tangent space of $M$ at $\mathbf{p}$. But the tangent space of $M$ at $\mathbf{p}$ is the plane orthogonal to $\nabla g(\mathbf{p})$ (by the implicit function theorem), so its normal direction is one-dimensional. Therefore $\nabla f(\mathbf{p})$ must be parallel to $\nabla g(\mathbf{p})$, i.e., $\nabla f(\mathbf{p}) = \lambda\, \nabla g(\mathbf{p})$ for some scalar $\lambda$. $\blacksquare$

4.4 Worked Example

Problem. Find the maximum of $f(x,y) = xy$ subject to $x^2 + y^2 = 1$.

Solution. Set $g(x,y) = x^2 + y^2 - 1$. The Lagrange multiplier equations:

$$\nabla f = \lambda \nabla g \implies (y, x) = \lambda(2x, 2y)$$

This gives $y = 2\lambda x$ and $x = 2\lambda y$. Multiplying: $xy = 4\lambda^2 xy$.

Case 1: $xy \neq 0$. Then $4\lambda^2 = 1$, so $\lambda = \pm 1/2$.

  • $\lambda = 1/2$: $y = x$ and $x^2 + x^2 = 1$, so $x = \pm 1/\sqrt{2}$. Points: $(1/\sqrt{2}, 1/\sqrt{2})$ and $(-1/\sqrt{2}, -1/\sqrt{2})$ with $f = 1/2$.
  • $\lambda = -1/2$: $y = -x$ and $x^2 + x^2 = 1$, so $x = \pm 1/\sqrt{2}$. Points: $(1/\sqrt{2}, -1/\sqrt{2})$ and $(-1/\sqrt{2}, 1/\sqrt{2})$ with $f = -1/2$.

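Case 2: $xy = 0$. Then either $x = 0$ or $y = 0$. If $x = 0$, the equation $y = 2\lambda x$ forces $y = 0$; if $y = 0$, the equation $x = 2\lambda y$ forces $x = 0$. Either way $(x,y) = (0,0)$, which violates the constraint, so this case yields no candidates. (The axis points $(0, \pm 1)$ and $(\pm 1, 0)$ lie on the circle with $f = 0$, but they do not satisfy the multiplier equations.)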

Maximum: $f = 1/2$ at $(\pm 1/\sqrt{2}, \pm 1/\sqrt{2})$. Minimum: $f = -1/2$ at $(\pm 1/\sqrt{2}, \mp 1/\sqrt{2})$. $\blacksquare$
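The result can be cross-checked without multipliers by sampling the constraint circle directly. This is a rough numerical sketch: the parametrisation $(\cos t, \sin t)$ and the grid size `samples` are choices made here, not part of the solution above.

```python
import math


def f(x, y):
    return x * y


# Parametrise the constraint x^2 + y^2 = 1 as (cos t, sin t) and sample it.
samples = 3600  # assumed grid resolution, chosen arbitrarily
values = [
    f(math.cos(2 * math.pi * i / samples), math.sin(2 * math.pi * i / samples))
    for i in range(samples)
]
max_value, min_value = max(values), min(values)
print(round(max_value, 6), round(min_value, 6))

# Multiplier equations y = 2*lam*x and x = 2*lam*y at the candidate with lam = 1/2.
x = y = 1 / math.sqrt(2)
lam = 0.5
residual = max(abs(y - 2 * lam * x), abs(x - 2 * lam * y))
print(residual)
```

The sampled extremes agree with $\pm 1/2$ up to floating-point error, and the residual confirms that $(1/\sqrt{2}, 1/\sqrt{2})$ satisfies the multiplier equations.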

4.5 Additional Worked Examples

Problem. Find and classify all critical points of $f(x,y) = x^4 + y^4 - 4xy$.

Solution

Compute the gradient:

$$\nabla f = (4x^3 - 4y,\, 4y^3 - 4x)$$

Set $\nabla f = (0,0)$:

$$x^3 = y, \quad y^3 = x$$

Substituting $y = x^3$ into $y^3 = x$: $(x^3)^3 = x$, i.e., $x^9 = x$, giving $x(x^8 - 1) = 0$. So $x = 0$ or $x = \pm 1$.

  • $x = 0$: $y = 0$. Critical point: $(0, 0)$.
  • $x = 1$: $y = 1$. Critical point: $(1, 1)$.
  • $x = -1$: $y = -1$. Critical point: $(-1, -1)$.

Second derivatives: $f_{xx} = 12x^2$, $f_{yy} = 12y^2$, $f_{xy} = -4$.

At $(0,0)$: $D = 0 \cdot 0 - 16 = -16 \lt 0$. Saddle point.

At $(1,1)$: $D = 12 \cdot 12 - 16 = 144 - 16 = 128 \gt 0$ and $f_{xx} = 12 \gt 0$. Local minimum with $f(1,1) = 1 + 1 - 4 = -2$.

At $(-1,-1)$: $D = 12 \cdot 12 - 16 = 128 \gt 0$ and $f_{xx} = 12 \gt 0$. Local minimum with $f(-1,-1) = 1 + 1 - 4 = -2$. $\blacksquare$
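The hand-computed determinants can be sanity-checked with finite differences. The sketch below approximates the second partials numerically; the helper `hessian_determinant` and the step `h` are assumptions made for this illustration, and the results are only accurate up to discretisation and rounding error.

```python
def f(x, y):
    return x**4 + y**4 - 4 * x * y


def hessian_determinant(func, x, y, h=1e-3):
    """Approximate D = fxx * fyy - fxy^2 with central differences (step h assumed)."""
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (
        func(x + h, y + h) - func(x + h, y - h)
        - func(x - h, y + h) + func(x - h, y - h)
    ) / (4 * h**2)
    return fxx * fyy - fxy**2


critical_points = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0)]
determinants = {p: hessian_determinant(f, *p) for p in critical_points}
for point, d in determinants.items():
    print(point, round(d, 2))
```

The approximations should land close to $-16$, $128$, and $128$, matching the classification above.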

Problem. Find and classify all critical points of $f(x,y) = x^3 + y^3 - 3xy$.

Solution

Compute the gradient:

$$\nabla f = (3x^2 - 3y,\, 3y^2 - 3x)$$

Set $\nabla f = (0,0)$:

$$3x^2 - 3y = 0 \implies y = x^2, \quad 3y^2 - 3x = 0 \implies y^2 = x$$

Substituting: $(x^2)^2 = x$, so $x^4 - x = 0$, giving $x(x^3 - 1) = 0$, so $x = 0$ or $x = 1$.

  • $x = 0$: $y = 0$. Critical point: $(0, 0)$.
  • $x = 1$: $y = 1$. Critical point: $(1, 1)$.

Second derivatives: $f_{xx} = 6x$, $f_{yy} = 6y$, $f_{xy} = -3$.

At $(0,0)$: $D = f_{xx} f_{yy} - f_{xy}^2 = 0 \cdot 0 - 9 = -9 \lt 0$. Saddle point.

At $(1,1)$: $D = 6 \cdot 6 - 9 = 27 \gt 0$ and $f_{xx} = 6 \gt 0$. Local minimum with $f(1,1) = -1$. $\blacksquare$
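The classification can also be probed by evaluating $f$ near each critical point. This is an informal check rather than a proof; the step `t` and the grid of neighbouring points are arbitrary choices made for this sketch.

```python
def f(x, y):
    return x**3 + y**3 - 3 * x * y


t = 0.01  # small step, chosen arbitrarily for illustration

# Near (0, 0): f changes sign, which is the signature of a saddle.
along_diagonal = f(t, t)        # equals 2t^3 - 3t^2
along_antidiagonal = f(t, -t)   # equals 3t^2

# Near (1, 1): every sampled neighbour has a larger value than f(1, 1) = -1.
offsets = [(dx, dy) for dx in (-t, 0, t) for dy in (-t, 0, t) if (dx, dy) != (0, 0)]
neighbours_higher = all(f(1 + dx, 1 + dy) > f(1, 1) for dx, dy in offsets)
print(along_diagonal < 0 < along_antidiagonal, neighbours_higher)
```

Along $y = x$ the function dips below $f(0,0) = 0$ and along $y = -x$ it rises above it, consistent with a saddle at the origin.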

Problem. Find the point on the plane $x + 2y + 3z = 6$ closest to the origin.

Solution

Minimise $f(x,y,z) = x^2 + y^2 + z^2$ (the squared distance, which has the same minimiser as the distance) subject to $g(x,y,z) = x + 2y + 3z - 6 = 0$.

$\nabla f = \lambda \nabla g$:

$$(2x, 2y, 2z) = \lambda(1, 2, 3)$$

This gives $x = \lambda/2$, $y = \lambda$, $z = 3\lambda/2$. Substituting into the constraint:

$$\frac{\lambda}{2} + 2\lambda + \frac{9\lambda}{2} = 6 \implies \frac{\lambda + 4\lambda + 9\lambda}{2} = 6 \implies 7\lambda = 6 \implies \lambda = \frac{6}{7}$$

Therefore $x = 3/7$, $y = 6/7$, $z = 9/7$. The closest point is $(3/7,\, 6/7,\, 9/7)$ with distance $\sqrt{9/49 + 36/49 + 81/49} = \sqrt{126/49} = \frac{3\sqrt{14}}{7}$. $\blacksquare$
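The same steps can be carried out in exact arithmetic. The sketch below generalises the substitution slightly: for a plane $\mathbf{n} \cdot \mathbf{x} = d$, the equations $2\mathbf{x} = \lambda \mathbf{n}$ give $\lambda = 2d / |\mathbf{n}|^2$. The variable names are illustrative.

```python
from fractions import Fraction

normal = (1, 2, 3)  # coefficients of the plane x + 2y + 3z = 6
d = 6

norm_sq = sum(n * n for n in normal)         # |n|^2 = 14
lam = Fraction(2 * d, norm_sq)               # from (2x, 2y, 2z) = lam * n and n . x = d
point = tuple(lam * n / 2 for n in normal)   # x_i = lam * n_i / 2
dist_sq = sum(c * c for c in point)          # squared distance to the origin

print(lam, point, dist_sq)
```

The squared distance comes out as $126/49$, whose square root is the $3\sqrt{14}/7$ found above.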

4.6 Multiple Constraints

Problem. Maximise $f(x,y,z) = xyz$ subject to $x + y + z = 1$ and $x^2 + y^2 + z^2 = 1/3$.

Solution

Set $g_1 = x + y + z - 1$ and $g_2 = x^2 + y^2 + z^2 - 1/3$. The Lagrange multiplier system is:

$$\nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2$$

$$(yz, xz, xy) = \lambda_1(1, 1, 1) + \lambda_2(2x, 2y, 2z)$$

This gives three equations:

$$yz = \lambda_1 + 2\lambda_2 x, \quad xz = \lambda_1 + 2\lambda_2 y, \quad xy = \lambda_1 + 2\lambda_2 z$$

Subtracting the first two: $z(y - x) = 2\lambda_2(x - y)$, giving $(y - x)(z + 2\lambda_2) = 0$.

Similarly, $(z - y)(x + 2\lambda_2) = 0$ and $(x - z)(y + 2\lambda_2) = 0$.

If $x = y = z$: from $g_1 = 0$: $3x = 1$, so $x = 1/3$. From $g_2 = 0$: $3(1/9) = 1/3$. This satisfies both constraints.

At $(1/3, 1/3, 1/3)$: $f = 1/27$.

If $x \neq y$: then $z + 2\lambda_2 = 0$. If also $y \neq z$: $x + 2\lambda_2 = 0$, so $x = z$.

With $x = z$: from $x + y + z = 1$ we get $2x + y = 1$. Substituting $y = 1 - 2x$ into $2x^2 + y^2 = 1/3$ gives $6x^2 - 4x + 2/3 = 0$, i.e., $(3x - 1)^2 = 0$, so $x = 1/3$ and $y = 1/3$. This contradicts $x \neq y$, so the case collapses to the symmetric point. If instead $y = z$, the same substitution with the roles of $x$ and $y$ exchanged again forces $x = y = z = 1/3$.

Therefore the only critical point is $(1/3, 1/3, 1/3)$, which gives $f = 1/27$.

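Since the constraint set is compact (the intersection of a plane and a sphere in $\mathbb{R}^3$), the extreme value theorem guarantees that both a maximum and a minimum exist. In fact the plane is tangent to the sphere here (its distance from the origin, $1/\sqrt{3}$, equals the radius), so the constraint set is the single point $(1/3, 1/3, 1/3)$, and the maximum of $xyz$ is $1/27$, attained there. $\blacksquare$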
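How the plane $x + y + z = 1$ meets the sphere $x^2 + y^2 + z^2 = 1/3$ can be checked exactly by comparing the plane's squared distance from the origin with the squared radius. The sketch below uses the standard formula $d^2 / |\mathbf{n}|^2$ for that distance; the variable names are illustrative.

```python
from fractions import Fraction

normal = (1, 1, 1)           # plane x + y + z = 1
d = 1
radius_sq = Fraction(1, 3)   # sphere x^2 + y^2 + z^2 = 1/3

norm_sq = sum(n * n for n in normal)
plane_dist_sq = Fraction(d * d, norm_sq)                 # squared distance, origin to plane
foot = tuple(Fraction(d * n, norm_sq) for n in normal)   # closest point of the plane

is_tangent = plane_dist_sq == radius_sq
value_at_foot = foot[0] * foot[1] * foot[2]
print(is_tangent, foot, value_at_foot)
```

Equal values mean the plane touches the sphere at exactly one point, the foot of the perpendicular from the origin, where $xyz = 1/27$.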

4.7 Common Pitfalls

:::caution Common Pitfalls

  • Lagrange multipliers find candidates only. The method produces candidates for constrained extrema but does not guarantee they are extrema. Always evaluate $f$ at all candidates and use additional reasoning (e.g., compactness of the constraint set via the extreme value theorem) to determine which gives the max/min.
  • Boundary vs. Interior. For unconstrained problems on a closed, bounded domain, check both interior critical points and boundary points separately.
  • Degenerate Hessian. When the Hessian determinant $D = 0$, the second derivative test is inconclusive. Use higher-order Taylor expansions or direct analysis of the function near the critical point.
  • Non-normalised constraint gradients. Ensure the constraint functions are written in the form $g = 0$; multiplying $g$ by a nonzero constant rescales $\lambda$ but does not change the critical points.

:::
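The degenerate case can be made concrete with two functions that both have $D = 0$ at the origin but behave differently there. The examples `bowl` ($x^4 + y^4$) and `ridge` ($x^4 - y^4$) are hypothetical illustrations chosen here, not taken from the text above.

```python
def bowl(x, y):
    # Hypothetical example: x^4 + y^4, with fxx = 12x^2, fyy = 12y^2, fxy = 0.
    return x**4 + y**4


def ridge(x, y):
    # Hypothetical example: x^4 - y^4, with fxx = 12x^2, fyy = -12y^2, fxy = 0.
    return x**4 - y**4


def hessian_det_bowl(x, y):
    return (12 * x**2) * (12 * y**2) - 0**2


def hessian_det_ridge(x, y):
    return (12 * x**2) * (-12 * y**2) - 0**2


t = 0.1  # arbitrary small step
print(hessian_det_bowl(0, 0), hessian_det_ridge(0, 0))  # both determinants vanish
print(bowl(t, 0) > 0, bowl(0, t) > 0)                   # bowl rises in both directions
print(ridge(t, 0) > 0, ridge(0, t) < 0)                 # ridge rises one way, falls the other
```

Both functions pass through the test as "inconclusive", yet one has a minimum at the origin and the other a saddle, which is why direct analysis is needed when $D = 0$.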