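An inner product on a vector space $V$ over $\mathbb{F}$ (where $\mathbb{F}=\mathbb{R}$ or $\mathbb{C}$) is a function $\langle\cdot,\cdot\rangle : V\times V\to\mathbb{F}$ satisfying:
Conjugate symmetry: $\langle u,v\rangle=\overline{\langle v,u\rangle}$
Linearity in the first argument: $\langle \alpha u+\beta w,v\rangle=\alpha\langle u,v\rangle+\beta\langle w,v\rangle$
Positive definiteness: $\langle v,v\rangle\ge 0$, with equality iff $v=0$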
A vector space equipped with an inner product is called an inner product space.
Example. The standard inner product on $\mathbb{R}^n$ is $\langle x,y\rangle=\sum_{i=1}^{n}x_i y_i$. On $\mathbb{C}^n$ it is $\langle x,y\rangle=\sum_{i=1}^{n}x_i\overline{y_i}$.
Example. On $C[a,b]$, the $L^2$ inner product is $\langle f,g\rangle=\int_a^b f(x)\overline{g(x)}\,dx$ (for real-valued functions the conjugate has no effect).
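The standard inner product on $\mathbb{C}^n$ can be sketched in a few lines. The helper name `inner` and the sample vectors are assumptions made for illustration; the conjugate is placed on the second argument so that the function is linear in the first, matching the axioms above.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y.

    Real inputs also work; the result is then a complex number with zero
    imaginary part.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))


u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

# Conjugate symmetry: <u, v> should equal the conjugate of <v, u>.
print(inner(u, v), inner(v, u).conjugate())

# Positive definiteness: <u, u> is real and non-negative.
print(inner(u, u))
```

Dropping the conjugate would break positive definiteness on $\mathbb{C}^n$: for example, the vector $(1, i)$ would then have "squared length" $1 + i^2 = 0$.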
7.2 Norms
Every inner product induces a norm:
$$\|v\|=\sqrt{\langle v,v\rangle}$$
Theorem 7.1 (Cauchy–Schwarz Inequality). For all $u,v\in V$,
$$|\langle u,v\rangle|\le\|u\|\,\|v\|,$$
with equality if and only if $u$ and $v$ are linearly dependent.
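Proof. If $v=0$, both sides are $0$ and the result holds. Assume $v\ne 0$. For any scalar $t\in\mathbb{F}$, positive definiteness gives
$$0\le\langle u-tv,u-tv\rangle=\langle u,u\rangle-t\langle v,u\rangle-\overline{t}\langle u,v\rangle+|t|^2\langle v,v\rangle.$$
Set $t=\dfrac{\langle u,v\rangle}{\langle v,v\rangle}$ (the value that minimises the right side):
$$0\le\|u\|^2-\frac{|\langle u,v\rangle|^2}{\|v\|^2}.$$
Rearranging: $|\langle u,v\rangle|^2\le\|u\|^2\|v\|^2$. Taking square roots gives the result. Equality holds iff $u-tv=0$, i.e., $u$ is a scalar multiple of $v$; together with the case $v=0$, this is exactly the statement that $u$ and $v$ are linearly dependent. ■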
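The key step of the proof, substituting $t=\langle u,v\rangle/\langle v,v\rangle$, can be checked numerically. The following is a minimal sketch for real vectors only; the helpers `inner` and `norm` and the sample vectors are illustrative assumptions.

```python
import math


def inner(x, y):
    """Standard inner product on R^n (real vectors of equal length assumed)."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x))


u = [3.0, -1.0, 2.0]
v = [1.0, 0.0, 1.0]

# The minimising scalar from the proof: t = <u, v> / <v, v>.
t = inner(u, v) / inner(v, v)
residual = [ui - t * vi for ui, vi in zip(u, v)]

# ||u - t v||^2 should match ||u||^2 - |<u, v>|^2 / ||v||^2 up to rounding.
lhs = inner(residual, residual)
rhs = inner(u, u) - inner(u, v) ** 2 / inner(v, v)
print(lhs, rhs)

# The inequality itself, and the equality case for a linearly dependent pair.
print(abs(inner(u, v)) <= norm(u) * norm(v))
w = [2.0 * vi for vi in v]
print(math.isclose(abs(inner(w, v)), norm(w) * norm(v)))
```

Since `lhs` is a squared norm it cannot be negative, which is precisely why the inequality holds.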
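Theorem 7.2 (Triangle Inequality). For all $u,v\in V$,
$$\|u+v\|\le\|u\|+\|v\|.$$
Proof.
$$\|u+v\|^2=\langle u+v,u+v\rangle=\|u\|^2+2\operatorname{Re}\langle u,v\rangle+\|v\|^2.$$
By Cauchy–Schwarz, $\operatorname{Re}\langle u,v\rangle\le|\langle u,v\rangle|\le\|u\|\,\|v\|$, so
$$\|u+v\|^2\le\|u\|^2+2\|u\|\,\|v\|+\|v\|^2=(\|u\|+\|v\|)^2.$$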
Taking square roots gives the result. ■
7.3 Orthogonality
Two vectors u,v are orthogonal if ⟨u,v⟩=0. We write u⊥v.
An orthonormal set $\{e_1,\dots,e_k\}$ satisfies $\langle e_i,e_j\rangle=\delta_{ij}$ (that is, $1$ if $i=j$ and $0$ otherwise).
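Theorem 7.3 (Pythagorean Theorem). If $u\perp v$, then
$$\|u+v\|^2=\|u\|^2+\|v\|^2.$$
Proof. $\|u+v\|^2=\|u\|^2+2\operatorname{Re}\langle u,v\rangle+\|v\|^2=\|u\|^2+\|v\|^2$, since $\langle u,v\rangle=0$. ■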
Proposition 7.4. Every orthonormal set is linearly independent.
Proof. If $\sum_{i=1}^{k}\alpha_i e_i=0$, then taking the inner product with $e_j$ gives $\alpha_j=\left\langle\sum_{i=1}^{k}\alpha_i e_i,e_j\right\rangle=\langle 0,e_j\rangle=0$ for each $j$. ■
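The proof rests on one fact: taking the inner product with $e_j$ isolates the coefficient $\alpha_j$. A small sketch in $\mathbb{R}^2$ illustrates this. The orthonormal pair and the coefficients below are arbitrary choices for illustration, not values from the text.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))


s = 1 / math.sqrt(2)
e1, e2 = [s, s], [s, -s]  # an orthonormal set in R^2
alphas = [3.0, -2.0]

# Build x = 3 e1 - 2 e2.
x = [alphas[0] * a + alphas[1] * b for a, b in zip(e1, e2)]

# Taking the inner product with e_j recovers alpha_j (up to rounding).
recovered = [inner(x, e1), inner(x, e2)]
print(recovered)
```

If `x` were the zero vector, both recovered coefficients would be zero, which is the linear independence argument in miniature.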
7.4 Gram–Schmidt Process
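The Gram–Schmidt process converts a linearly independent set $\{v_1,\dots,v_n\}$ into an orthonormal set $\{e_1,\dots,e_n\}$:
$$u_1=v_1,\qquad e_1=\frac{u_1}{\|u_1\|},$$
$$u_k=v_k-\sum_{i=1}^{k-1}\langle v_k,e_i\rangle e_i,\qquad e_k=\frac{u_k}{\|u_k\|}\quad(k=2,\dots,n).$$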
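Proposition 7.5. At each step, $\operatorname{span}\{e_1,\dots,e_k\}=\operatorname{span}\{v_1,\dots,v_k\}$.
Proof. By induction on $k$; the case $k=1$ is immediate. By construction, $u_k$ is $v_k$ minus its projection onto $\operatorname{span}\{e_1,\dots,e_{k-1}\}=\operatorname{span}\{v_1,\dots,v_{k-1}\}$. So $u_k\in\operatorname{span}\{v_1,\dots,v_k\}$ and $v_k=u_k+\sum_{i=1}^{k-1}\langle v_k,e_i\rangle e_i\in\operatorname{span}\{u_1,\dots,u_k\}$. Since each $e_i$ is a nonzero scalar multiple of $u_i$, the spans coincide. ■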
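The recurrence translates almost line for line into code. Below is a minimal sketch for real vectors using only the standard library. The function name `gram_schmidt` and the tolerance `tol` are assumptions for illustration: the tolerance decides when $u_k$ is treated as zero, a case the formulas above exclude by requiring linear independence.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))


def gram_schmidt(vectors, tol=1e-12):
    """Classical Gram-Schmidt on real vectors; returns an orthonormal list.

    `tol` is an assumed threshold (not from the text): if ||u_k|| <= tol the
    input is treated as linearly dependent and a ValueError is raised.
    """
    basis = []
    for v in vectors:
        # u_k = v_k - sum_i <v_k, e_i> e_i, with coefficients taken from v_k.
        coeffs = [inner(v, e) for e in basis]
        u = list(v)
        for c, e in zip(coeffs, basis):
            u = [ui - c * ei for ui, ei in zip(u, e)]
        length = math.sqrt(inner(u, u))
        if length <= tol:
            raise ValueError("input vectors are linearly dependent")
        # e_k = u_k / ||u_k||
        basis.append([ui / length for ui in u])
    return basis


es = gram_schmidt([[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
for e in es:
    print(e)
```

For this particular input the process returns the standard basis of $\mathbb{R}^3$, which makes the span property easy to see: the first $k$ input vectors only involve the first $k$ coordinates.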
7.5 Orthogonal Projection
The orthogonal projection of $v$ onto a subspace $W$ with orthonormal basis $\{e_1,\dots,e_k\}$ is
$$\operatorname{proj}_W(v)=\sum_{i=1}^{k}\langle v,e_i\rangle e_i.$$
Theorem 7.6 (Best Approximation). Among all vectors in $W$, the orthogonal projection $\operatorname{proj}_W(v)$ minimises the distance to $v$:
$$\|v-\operatorname{proj}_W(v)\|\le\|v-w\|\quad\text{for all } w\in W.$$
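Proof. For any $w\in W$, write $v-w=(v-\operatorname{proj}_W(v))+(\operatorname{proj}_W(v)-w)$. The first term is orthogonal to $W$ (hence to the second term, which lies in $W$), so by the Pythagorean theorem:
$$\|v-w\|^2=\|v-\operatorname{proj}_W(v)\|^2+\|\operatorname{proj}_W(v)-w\|^2\ge\|v-\operatorname{proj}_W(v)\|^2.$$
Taking square roots gives the result. ■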
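The projection formula and the best-approximation property can be tried out on a small example. This sketch assumes real vectors and an orthonormal basis supplied by the caller; the names `project` and `dist`, the plane, and the test vector are illustrative choices, not part of the text.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))


def project(v, onb):
    """Orthogonal projection of v onto span(onb); onb is assumed orthonormal."""
    p = [0.0] * len(v)
    for e in onb:
        c = inner(v, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p


def dist(x, y):
    d = [xi - yi for xi, yi in zip(x, y)]
    return math.sqrt(inner(d, d))


# An orthonormal basis of a plane W in R^3 and a vector v outside W.
s2, s6 = math.sqrt(2), math.sqrt(6)
onb = [[1 / s2, 0.0, 1 / s2], [-1 / s6, 2 / s6, 1 / s6]]
v = [1.0, 0.0, 0.0]

p = project(v, onb)
best = dist(v, p)

# Compare against a few other points of W (arbitrary combinations of the basis).
others = [[a * x + b * y for x, y in zip(*onb)]
          for a in (-1.0, 0.0, 0.5, 2.0) for b in (-1.0, 0.0, 0.5, 2.0)]
print(p, best, all(best <= dist(v, w) + 1e-12 for w in others))
```

The residual `v - p` is orthogonal to both basis vectors, which is the fact the proof feeds into the Pythagorean theorem.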
A fundamental application of orthogonal projection is fitting functions to data. Given a subspace $W$ of an inner product space $V$ and a target $v\in V$, the best approximation in $W$ is the orthogonal projection $\operatorname{proj}_W(v)$.
7.7 Worked Example: Gram–Schmidt
Problem. Apply the Gram–Schmidt process to $v_1=(1,1,0)$, $v_2=(1,0,1)$, $v_3=(0,1,1)$ in $\mathbb{R}^3$ with the standard inner product.
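Solution
Following the process step by step:
$$u_1=(1,1,0),\qquad e_1=\tfrac{1}{\sqrt{2}}(1,1,0),$$
$$u_2=v_2-\langle v_2,e_1\rangle e_1=(1,0,1)-\tfrac12(1,1,0)=\tfrac12(1,-1,2),\qquad e_2=\tfrac{1}{\sqrt{6}}(1,-1,2),$$
$$u_3=v_3-\langle v_3,e_1\rangle e_1-\langle v_3,e_2\rangle e_2=(0,1,1)-\tfrac12(1,1,0)-\tfrac16(1,-1,2)=\tfrac23(-1,1,1),\qquad e_3=\tfrac{1}{\sqrt{3}}(-1,1,1).$$
The orthonormal basis is $\left\{\tfrac{1}{\sqrt2}(1,1,0),\ \tfrac{1}{\sqrt6}(1,-1,2),\ \tfrac{1}{\sqrt3}(-1,1,1)\right\}$. ■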
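A hand computation like this is easy to double-check: the claimed basis should have the identity as its Gram matrix, and $v_2$ should lie in the span of the first two basis vectors. The sketch below hard-codes the three vectors $\tfrac{1}{\sqrt2}(1,1,0)$, $\tfrac{1}{\sqrt6}(1,-1,2)$, $\tfrac{1}{\sqrt3}(-1,1,1)$; the variable names are illustrative.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))


s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
es = [
    [1 / s2, 1 / s2, 0.0],
    [1 / s6, -1 / s6, 2 / s6],
    [-1 / s3, 1 / s3, 1 / s3],
]

# Gram matrix of the claimed basis: should be the identity up to rounding.
gram = [[inner(a, b) for b in es] for a in es]
for row in gram:
    print([round(x, 12) for x in row])

# v2 should be recovered from its components along e1 and e2 alone.
v2 = [1.0, 0.0, 1.0]
rebuilt = [inner(v2, es[0]) * a + inner(v2, es[1]) * b
           for a, b in zip(es[0], es[1])]
print(rebuilt)
```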
:::caution Common Pitfall
The Gram–Schmidt process requires a linearly independent starting set. If the input vectors are linearly dependent, one of the $u_k$ will be the zero vector, and the process will fail (attempting to divide by zero in the normalisation step).
:::
7.8 Worked Example: Orthogonal Projection onto a Plane
Problem. Find the orthogonal projection of v=(3,−1,2) onto the plane W spanned by (1,0,1) and (0,1,1) in R3 with the standard inner product. Also find the distance from v to W.
Solution
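First, apply Gram–Schmidt to obtain an orthonormal basis for $W$:
$$e_1=\tfrac{1}{\sqrt2}(1,0,1),\qquad u_2=(0,1,1)-\tfrac12(1,0,1)=\tfrac12(-1,2,1),\qquad e_2=\tfrac{1}{\sqrt6}(-1,2,1).$$
Then
$$\operatorname{proj}_W(v)=\langle v,e_1\rangle e_1+\langle v,e_2\rangle e_2=\tfrac52(1,0,1)-\tfrac12(-1,2,1)=(3,-1,2).$$
The residual is $v-\operatorname{proj}_W(v)=(0,0,0)$, so the distance is $0$. This means $v\in W$ itself. Indeed, $v=3(1,0,1)-(0,1,1)\in\operatorname{span}\{(1,0,1),(0,1,1)\}$. ■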
7.9 Worked Example: $L^2$ Least Squares Approximation
Problem. Find the constant function $c$ (i.e., the best approximation by a degree-0 polynomial) that minimises $\int_0^1(e^x-c)^2\,dx$.
Solution
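We want the orthogonal projection of $f(x)=e^x$ onto the subspace $W=\operatorname{span}\{1\}$ in the $L^2[0,1]$ inner product space. The orthonormal basis for $W$ is $e_1=1$ (since $\|1\|^2=\int_0^1 1\,dx=1$).
$$\operatorname{proj}_W(f)=\langle f,1\rangle\cdot 1=\left(\int_0^1 e^x\,dx\right)\cdot 1=(e-1)\cdot 1$$
So the best constant approximation is $c=e-1\approx 1.718$.
Verification: The error is $e^x-(e-1)$. Setting the derivative of $\int_0^1(e^x-c)^2\,dx$ with respect to $c$ to zero gives $\int_0^1(e^x-c)\,dx=0$, i.e., $c=e-1$, which agrees. By contrast, expanding $e^x$ as a Taylor series around $x=1/2$ gives the constant term $e^{1/2}\approx 1.649$, but our answer $e-1\approx 1.718$ is the $L^2$-optimal constant, not the Taylor approximation. The two optimisation criteria differ. ■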
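The claim that $c=e-1$ beats the Taylor constant $e^{1/2}$ in the $L^2$ sense can be checked with a crude numerical integral. The sketch below uses a midpoint rule; the function name `l2_error` and the number of subintervals are assumptions chosen for illustration.

```python
import math


def l2_error(c, n=10_000):
    """Midpoint-rule estimate of the integral of (e^x - c)^2 over [0, 1]."""
    h = 1.0 / n
    return h * sum((math.exp((i + 0.5) * h) - c) ** 2 for i in range(n))


c_best = math.e - 1.0      # projection of e^x onto span{1}
c_taylor = math.exp(0.5)   # constant term of the Taylor series at x = 1/2

print(l2_error(c_best), l2_error(c_taylor))

# Nudging c away from e - 1 in either direction should increase the error.
print(l2_error(c_best - 0.01), l2_error(c_best + 0.01))
```

Expanding the square gives the error in closed form as $\tfrac{e^2-1}{2}-2c(e-1)+c^2$, a parabola in $c$ with its minimum at $c=e-1$, so the numerical values can be compared against that expression.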
7.10 Common Pitfalls
The Cauchy–Schwarz inequality is not the triangle inequality. Cauchy–Schwarz bounds the absolute value of the inner product by the product of norms; the triangle inequality bounds the norm of a sum by the sum of norms. They are related (the triangle inequality follows from Cauchy–Schwarz) but distinct.
Classical Gram–Schmidt is numerically unstable. In floating-point arithmetic, rounding errors can cause the computed vectors to lose orthogonality, so modified Gram–Schmidt or Householder reflections are preferred.
Orthogonal projection decomposes $v$ uniquely. $v=\operatorname{proj}_W(v)+v^\perp$ where $v^\perp\in W^\perp$. This decomposition is unique and is called the orthogonal decomposition.
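The instability of classical Gram–Schmidt noted above is easy to provoke. The sketch below compares the classical recurrence with the modified variant, which takes each coefficient from the partially reduced vector instead of the original $v_k$. The three nearly parallel input vectors (a small Läuchli-style example with `eps = 1e-8`) and all function names are assumptions chosen for illustration.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))


def normalise(u):
    length = math.sqrt(inner(u, u))
    return [ui / length for ui in u]


def classical_gs(vectors):
    basis = []
    for v in vectors:
        # All coefficients are computed from the original v_k.
        coeffs = [inner(v, e) for e in basis]
        u = list(v)
        for c, e in zip(coeffs, basis):
            u = [ui - c * ei for ui, ei in zip(u, e)]
        basis.append(normalise(u))
    return basis


def modified_gs(vectors):
    basis = []
    for v in vectors:
        u = list(v)
        for e in basis:
            # Each coefficient uses the partially reduced u, not v_k.
            c = inner(u, e)
            u = [ui - c * ei for ui, ei in zip(u, e)]
        basis.append(normalise(u))
    return basis


def max_off_diagonal(basis):
    """Largest |<e_i, e_j>| over i != j; zero for a perfectly orthonormal set."""
    return max(abs(inner(basis[i], basis[j]))
               for i in range(len(basis)) for j in range(i))


eps = 1e-8
vs = [[1.0, eps, 0.0, 0.0], [1.0, 0.0, eps, 0.0], [1.0, 0.0, 0.0, eps]]
print(max_off_diagonal(classical_gs(vs)))
print(max_off_diagonal(modified_gs(vs)))
```

In exact arithmetic the two variants agree. In double precision, `1 + eps**2` rounds to `1`, and on this input the classical version returns two vectors that are far from orthogonal, while the modified version keeps the off-diagonal inner products on the order of `eps`.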