Skip to content

Limit Theorems

4.1 The Law of Large Numbers

Theorem 4.1 (Weak Law of Large Numbers). Let X1,X2,X_1, X_2, \ldots be i.i.d. With E[Xi]=μE[X_i] = \mu and Var(Xi)=σ2<\mathrm{Var}(X_i) = \sigma^2 < \infty. Then for every ε>0\varepsilon > 0:

limnP(1ni=1nXiμε)=0\lim_{n \to \infty} P\left(\left|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right| \geq \varepsilon\right) = 0

Proof. Let Sn=1ni=1nXiS_n = \frac{1}{n}\sum_{i=1}^{n} X_i. Then E[Sn]=μE[S_n] = \mu and Var(Sn)=σ2/n\mathrm{Var}(S_n) = \sigma^2/n. By Chebyshev”s inequality:

P(Snμε)Var(Sn)ε2=σ2nε20as nP(|S_n - \mu| \geq \varepsilon) \leq \frac{\mathrm{Var}(S_n)}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2} \to 0 \quad \mathrm{as\ } n \to \infty

\blacksquare

Theorem 4.2 (Strong Law of Large Numbers). Under the same conditions:

P(limn1ni=1nXi=μ)=1P\left(\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n} X_i = \mu\right) = 1

The sample mean converges to the population mean almost surely.

4.2 The Central Limit Theorem

Theorem 4.3 (Central Limit Theorem). Let X1,X2,X_1, X_2, \ldots be i.i.d. With E[Xi]=μE[X_i] = \mu and Var(Xi)=σ2(0,)\mathrm{Var}(X_i) = \sigma^2 \in (0, \infty). Then

SnnμσndN(0,1)\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0, 1)

Where Sn=i=1nXiS_n = \sum_{i=1}^{n} X_i and d\xrightarrow{d} denotes convergence in distribution.

Equivalently, for large nn:

P(Snnμσnz)Φ(z)P\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \leq z\right) \approx \Phi(z)

Where Φ\Phi is the CDF of the standard normal.

Proof (using characteristic functions). Let φX(t)=E[eitX]\varphi_X(t) = E[e^{itX}] be the characteristic function of X1X_1. The characteristic function of (Snnμ)/(σn)(S_n - n\mu)/(\sigma\sqrt{n}) is:

φn(t)=[φX(tσn)]neitnμ/σ\varphi_n(t) = \left[\varphi_X\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n \cdot e^{-it\sqrt{n}\mu/\sigma}

Expanding φX\varphi_X around 0: φX(s)=1+iμs(σ2+μ2)s22+o(s2)\varphi_X(s) = 1 + i\mu s - \frac{(\sigma^2 + \mu^2)s^2}{2} + o(s^2). Substituting s=t/(σn)s = t/(\sigma\sqrt{n}):

φn(t)=[1+iμtσn(σ2+μ2)t22σ2n+o(1n)]neitnμ/σ\varphi_n(t) = \left[1 + \frac{i\mu t}{\sigma\sqrt{n}} - \frac{(\sigma^2 + \mu^2)t^2}{2\sigma^2 n} + o\left(\frac{1}{n}\right)\right]^n \cdot e^{-it\sqrt{n}\mu/\sigma}

Using limn(1+an/n)n=eliman\lim_{n \to \infty}(1 + a_n/n)^n = e^{\lim a_n}:

limnφn(t)=exp(iμtσ(σ2+μ2)t22σ2)exp(iμtσ)=et2/2\lim_{n \to \infty} \varphi_n(t) = \exp\left(\frac{i\mu t}{\sigma} - \frac{(\sigma^2 + \mu^2)t^2}{2\sigma^2}\right) \cdot \exp\left(-\frac{i\mu t}{\sigma}\right) = e^{-t^2/2}

This is the characteristic function of N(0,1)N(0, 1). By Levy’s continuity theorem, the convergence in distribution follows. \blacksquare

4.3 Worked Examples

Problem. A fair die is rolled 100 times. Approximate the probability that the sum exceeds 370.

Solution

Let XiX_i be the value of the ii-th roll. Then E[Xi]=7/2=3.5E[X_i] = 7/2 = 3.5 and Var(Xi)=35/122.917\mathrm{Var}(X_i) = 35/12 \approx 2.917.

S100=i=1100XiS_{100} = \sum_{i=1}^{100} X_i. By the CLT:

S10035010035/12N(0,1)\frac{S_{100} - 350}{\sqrt{100 \cdot 35/12}} \approx N(0, 1)

P(S100>370)=P(Z>370350291.7)P(Z>1.17)0.121P(S_{100} > 370) = P\left(Z > \frac{370 - 350}{\sqrt{291.7}}\right) \approx P(Z > 1.17) \approx 0.121

\blacksquare

Worked Example: Sample Mean Distribution

Solution. A population has mean 50 and standard deviation 10. Find the probability that the mean of a sample of 64 observations exceeds 52.

By the CLT, XˉN(50,100/64)=N(50,1.5625)\bar{X} \approx N(50, 100/64) = N(50, 1.5625).

P(Xˉ>52)=P(Z>52501.5625)=P(Z>1.6)0.0548P(\bar{X} > 52) = P\left(Z > \frac{52 - 50}{\sqrt{1.5625}}\right) = P(Z > 1.6) \approx 0.0548

\blacksquare

4.4 Common Pitfalls

  • The CLT does not apply to small samples. The CLT is an asymptotic result. For small nn ( n<30n < 30), the normal approximation can be poor unless the underlying distribution is already close to normal. Use the Berry—Esseen theorem for finite-sample bounds.
  • Independence is critical for the LLN and CLT. If the XiX_i are dependent, the sample mean may not converge to the population mean, or the convergence rate may differ. For stationary sequences with weak dependence, versions of these theorems still hold, but the …/1-number-and-algebra/3_proof-and-logics are more involved.
  • Convergence in distribution is weaker than convergence in probability. The CLT gives convergence in distribution of the standardised sum, not convergence of the sum itself. The LLN gives the latter (convergence in probability).