Skip to content

Law of Large Numbers (LLN)

Definition / Introduction

  • The Law of Large Numbers (LLN) is a fundamental theorem in probability theory that describes the long-term stability of the average of random variables.
  • It essentially states that as you collect more and more independent observations from a random process, the sample average of these observations will get closer and closer to the theoretical Expected Value (or mean) \(\mu\) of the process.
  • It provides the mathematical justification for why we can estimate underlying population parameters (like the mean) using sample data.

Key Concepts

1. The Core Idea

  • Let \(X_1, X_2, ..., X_n\) be a sequence of independent and identically distributed (i.i.d.) random variables, each having a finite mean \(\mu = E[X_i]\).
  • Let \(\bar{X}_n = \frac{X_1 + X_2 + \dots + X_n}{n}\) be the sample mean (average) of the first \(n\) observations.
  • The LLN states that as the sample size \(n\) increases (\(n \to \infty\)), the sample mean \(\bar{X}_n\) converges to the true mean \(\mu\). $$ \bar{X}_n \xrightarrow{n \to \infty} \mu $$

2. Forms of the LLN

There are two main forms, differing in the type of convergence: * Weak Law of Large Numbers (WLLN): States that for any small positive number \(\epsilon > 0\), the probability that the sample mean \(\bar{X}_n\) is further than \(\epsilon\) away from the true mean \(\mu\) approaches zero as \(n\) approaches infinity. (Convergence in probability). $$ \lim_{n \to \infty} P(|\bar{X}n - \mu| > \epsilon) = 0 $$ Or equivalently: $$ \lim} P(|\bar{Xn - \mu| \le \epsilon) = 1 $$ * Strong Law of Large Numbers (SLLN): Makes a stronger statement, saying that the sample mean \(\bar{X}_n\) converges to the true mean \(\mu\) almost surely (with probability 1). $$ P(\lim_n = \mu) = 1 $$ (Almost sure convergence). * } \bar{XPractical Implication: For most data science applications, the distinction is not critical. Both forms guarantee that the sample average stabilizes around the true mean for large samples.

3. Intuition

  • In the short run, randomness can cause sample averages to fluctuate significantly.
  • In the long run, these random fluctuations tend to cancel each other out, and the average gets pulled towards the underlying expected value. The larger the sample, the more likely the sample average is to be very close to the true mean.

4. Requirements

  • The random variables must be independent.
  • They should be identically distributed (drawn from the same underlying distribution). This can be relaxed under certain conditions, but the i.i.d. case is standard.
  • The underlying distribution must have a finite mean \(\mu\). (Distributions like the Cauchy distribution do not have a finite mean, and the LLN does not apply).

Connections to Other Topics & Relevance

  • Foundation of Empirical Probability: Justifies estimating the probability \(p\) of an event by observing its relative frequency (proportion) in a large number of trials. The observed frequency converges to the true theoretical probability \(p\) (which is the expected value of a Bernoulli trial).
  • Basis for Statistical Inference: Allows us to use sample statistics (like the sample mean) to estimate population parameters (like the population mean \(\mu\)). Provides confidence that larger samples give better estimates.
  • A/B Testing: The LLN ensures that as we collect more user data in an A/B test, the observed conversion rates for each variant will converge towards their true underlying conversion rates.
  • Monte Carlo Methods: Relies on the LLN. By simulating a process many times, the average result of the simulations converges to the expected value of the quantity being estimated (e.g., estimating complex integrals or option prices).
  • Expected Value: The LLN connects the theoretical concept of expected value to the practical calculation of a sample average.

Summary

  • The Law of Large Numbers (LLN) states that the sample average (\(\bar{X}_n\)) converges to the theoretical expected value (\(\mu\)) as the sample size (\(n\)) grows infinitely large.
  • Provides justification for using sample means to estimate population means and relative frequencies to estimate probabilities.
  • Requires independent observations from a distribution with a finite mean.
  • Guarantees long-term stability of averages from random processes.
  • Fundamental principle underlying sampling, estimation, and simulation methods.

Sources