While the Expected Value \(E[X]\) describes the center of a random variable's distribution, the Variance \(Var(X)\) measures its spread or dispersion.
It quantifies how much the values of the random variable \(X\) tend to deviate from their mean \(\mu = E[X]\) on average. A higher variance means values are more spread out; lower variance means they are more clustered around the mean.
The Standard Deviation \(SD(X)\), or \(\sigma_X\), is simply the square root of the variance. It's often preferred for interpretation because it's in the same units as the random variable \(X\).
Notation: variance is written \(Var(X)\), \(\sigma^2\), or \(\sigma_X^2\); standard deviation is written \(SD(X)\), \(\sigma\), or \(\sigma_X\).
1. Definition based on Expected Squared Deviation
The variance is defined as the expected value of the squared difference between the random variable \(X\) and its mean \(\mu = E[X]\):
$$ Var(X) = E[(X - \mu)^2] $$
A more convenient formula for calculating variance is derived from the definition using linearity of expectation: expanding the square gives \(E[(X - \mu)^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - 2\mu^2 + \mu^2\), so
$$ Var(X) = E[X^2] - (E[X])^2 $$
$$ Var(X) = E[X^2] - \mu^2 $$
This requires calculating two expected values: \(E[X]\) (the mean) and \(E[X^2]\) (the expected value of \(X^2\), calculated using LOTUS as \(\sum x^2 p(x)\) or \(\int x^2 f(x) dx\)).
The standard deviation is the positive square root of the variance:
$$ SD(X) = \sigma = \sqrt{Var(X)} = \sqrt{E[(X - \mu)^2]} $$
Interpretation: \(\sigma\) measures the typical or average distance of the values of \(X\) from their mean \(\mu\). A smaller \(\sigma\) means data points are typically close to the mean; a larger \(\sigma\) means they are typically far from the mean.
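As a concrete worked example (a fair six-sided die, chosen here for illustration), the following sketch computes the variance both from the definition \(E[(X - \mu)^2]\) and from the shortcut \(E[X^2] - \mu^2\), and confirms they agree:

```python
from fractions import Fraction

# PMF of a fair six-sided die: P(X = x) = 1/6 for x = 1..6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())             # E[X] = 7/2
e_x2 = sum(x**2 * p for x, p in pmf.items())        # E[X^2] via LOTUS = 91/6

var_def = sum((x - mu)**2 * p for x, p in pmf.items())  # E[(X - mu)^2]
var_short = e_x2 - mu**2                                # E[X^2] - mu^2

sd = float(var_def) ** 0.5                          # sqrt(35/12) ≈ 1.708

print(mu, e_x2, var_def, var_short, round(sd, 3))
```

Both routes give \(Var(X) = 91/6 - (7/2)^2 = 35/12\); using `Fraction` keeps the arithmetic exact so the two formulas match to the last digit.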
Non-negativity: \(Var(X) \ge 0\). Variance is zero if and only if \(X\) is a constant (no spread).
Constants: \(Var(b) = 0\) for any constant \(b\).
Linear Transformation: For constants \(a\) and \(b\):
$$ Var(aX + b) = a^2 Var(X) $$
Note: Adding a constant \(b\) shifts the distribution but doesn't change its spread, so \(b\) disappears. Multiplying by \(a\) scales the deviations, and squaring \(a\) reflects the squaring in the variance definition.
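A quick numerical check of the scaling rule (the values \(a = 3\), \(b = 5\) and the fair-die \(X\) are illustrative choices, not from the text above):

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die X

def variance(pmf):
    """Var = E[X^2] - (E[X])^2 for a discrete pmf {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum(x**2 * p for x, p in pmf.items()) - mu**2

a, b = 3, 5
# Distribution of Y = aX + b: same probabilities, transformed values
pmf_y = {a * x + b: p for x, p in pmf.items()}

print(variance(pmf), variance(pmf_y))  # 35/12 and 9 * (35/12) = 105/4
```

The shift \(b\) never appears in the result, and the variance is multiplied by \(a^2 = 9\), exactly as \(Var(aX + b) = a^2 Var(X)\) predicts.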
For any random variables \(X\) and \(Y\):
$$ Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) $$
where \(Cov(X, Y)\) is the covariance between \(X\) and \(Y\).
If \(X\) and \(Y\) are independent, then \(Cov(X, Y) = 0\) and the formula simplifies significantly:
$$ Var(X + Y) = Var(X) + Var(Y) \quad (\text{if X, Y independent}) $$
$$ Var(X - Y) = Var(X) + Var(Y) \quad (\text{if X, Y independent}) $$
Note: variances ADD even when subtracting independent variables, because \(Var(-Y) = (-1)^2 Var(Y) = Var(Y)\); subtracting an independent variable introduces additional uncertainty rather than cancelling it.
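A simulation sketch makes this concrete (the distributions, sample size, and seed here are arbitrary choices): with independent \(X\) and \(Y\) where \(Var(X) = 4\) and \(Var(Y) = 9\), both the sum and the difference should have variance close to \(4 + 9 = 13\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.normal(loc=0.0, scale=2.0, size=n)   # Var(X) = 2^2 = 4
y = rng.normal(loc=5.0, scale=3.0, size=n)   # Var(Y) = 3^2 = 9, independent of X

# Both sample variances should be close to Var(X) + Var(Y) = 13
print(np.var(x + y))
print(np.var(x - y))
```

With a million samples, both printed values land very close to 13; neither the means nor the sign of the combination affects the spread.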
Risk Assessment: In finance and business, variance and standard deviation are key measures of risk or volatility (e.g., of investment returns).
Data Analysis & Feature Scaling: Standard deviation is used in standardization (calculating Z-scores: \(Z = (X - \mu) / \sigma\)) which is essential for many machine learning algorithms. Understanding variance helps interpret feature importance and variability.
Confidence Intervals & Hypothesis Testing: Standard deviation (combined with the sample size to give the standard error of the mean, \(\sigma / \sqrt{n}\)) is crucial for constructing confidence intervals and performing hypothesis tests about population means.
Normal Distribution & Empirical Rule: Standard deviation defines the intervals (\(\mu \pm k\sigma\)) containing specific percentages of data (68%, 95%, 99.7%).
Process Control: Used to monitor the stability and consistency of processes (e.g., manufacturing).
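Two of the applications above, standardization and the empirical rule, can be sketched together in a short simulation (the parameters \(\mu = 10\), \(\sigma = 2\), the sample size, and the seed are all arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 10.0, 2.0
samples = rng.normal(loc=mu, scale=sigma, size=1_000_000)

# Standardization: Z = (X - mu) / sigma has mean ~0 and standard deviation ~1
z = (samples - mu) / sigma
print(round(z.mean(), 3), round(z.std(), 3))

# Empirical rule: fraction of values within k standard deviations of the mean
fracs = [np.mean(np.abs(samples - mu) <= k * sigma) for k in (1, 2, 3)]
print([round(f, 4) for f in fracs])  # close to 0.68, 0.95, 0.997
```

The standardized values are unit-free, which is why Z-scores let machine learning algorithms compare features measured on different scales, and the three fractions recover the 68%/95%/99.7% intervals of the empirical rule.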