A norm on a vector space \(V\) is a function \(\| \cdot \| : V \to \mathbb{R}\) that assigns a non-negative real-valued length or size to each vector \(\mathbf{x} \in V\), satisfying the following properties for all vectors \(\mathbf{x}, \mathbf{y} \in V\) and all scalars \(\alpha \in \mathbb{R}\) (or \(\mathbb{C}\)):
Non-negativity: \(\|\mathbf{x}\| \ge 0\).
Definiteness: \(\|\mathbf{x}\| = 0\) if and only if \(\mathbf{x} = \mathbf{0}\) (the zero vector).
Absolute homogeneity: \(\|\alpha \mathbf{x}\| = |\alpha| \, \|\mathbf{x}\|\) (scaling a vector scales its length by the absolute value of the scalar).
Triangle Inequality: \(\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|\). (The length of a side of a triangle is less than or equal to the sum of the lengths of the other two sides.)
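These axioms can be checked numerically. A minimal sketch using NumPy's Euclidean (L₂) norm on random vectors (the vectors and the scalar here are illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
y = rng.standard_normal(4)
alpha = -2.5

norm = np.linalg.norm  # Euclidean (L2) norm by default

# Non-negativity and definiteness
assert norm(x) >= 0
assert norm(np.zeros(4)) == 0

# Absolute homogeneity: ||alpha * x|| = |alpha| * ||x||
assert np.isclose(norm(alpha * x), abs(alpha) * norm(x))

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
assert norm(x + y) <= norm(x) + norm(y)
```

Any candidate "length" function that passes these checks for all inputs is a norm; failing any one of them (e.g. a squared length fails homogeneity) disqualifies it.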
Norms serve several purposes:
Measure Size/Length: Quantify the magnitude of vectors.
Define Distance: The distance between two vectors \(\mathbf{x}\) and \(\mathbf{y}\) can be defined as the norm of their difference: \(d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|\).
Measure Matrix Magnitude: Quantify the "size" of a matrix, often related to its amplification effect on vectors.
Regularization in ML: Specific norms (like L₁ and L₂) are used in regularization techniques to penalize large parameter values (weights) in models, helping to prevent overfitting.
Error Measurement: Norms are used to measure the difference between predicted and actual values (e.g., the L₂ norm of the error vector in Mean Squared Error).
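The last two uses can be sketched together. A short NumPy example (the vectors and weights are made up for illustration) computing a norm-based distance, its relation to MSE, and the L₁/L₂ penalties used in regularization:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])

# Distance between vectors: the norm of their difference
err = y_pred - y_true
l2_error = np.linalg.norm(err)   # Euclidean distance d(y_pred, y_true)
mse = np.mean(err ** 2)          # MSE is the squared L2 error divided by n

# L1 and L2 penalties on a weight vector, as used in regularization
w = np.array([0.5, -1.5, 0.0, 2.0])
l1_penalty = np.linalg.norm(w, ord=1)  # sum of |w_i|
l2_penalty = np.linalg.norm(w, ord=2)  # sqrt of sum of w_i^2
```

Note how the L₁ penalty treats the zero weight as free, which is why L₁ regularization (lasso) tends to produce sparse weights, while L₂ (ridge) shrinks all weights smoothly.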
The most common vector norms belong to the family of \(L_p\) norms (or p-norms), defined for \(p \ge 1\):
$$ \|\mathbf{x}\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p} $$
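The definition above can be implemented directly and checked against NumPy's built-in `np.linalg.norm` (the helper name `p_norm` is our own):

```python
import numpy as np

def p_norm(x, p):
    """Compute the Lp norm directly from the definition."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])

assert np.isclose(p_norm(x, 1), 7.0)  # L1: |3| + |-4|
assert np.isclose(p_norm(x, 2), 5.0)  # L2: sqrt(9 + 16)

# The direct formula matches NumPy's implementation
for p in (1, 2, 3):
    assert np.isclose(p_norm(x, p), np.linalg.norm(x, ord=p))

# As p -> infinity, the Lp norm approaches max |x_i|
assert np.isclose(np.linalg.norm(x, ord=np.inf), 4.0)
```

The limiting case \(p \to \infty\) gives the \(L_\infty\) (maximum) norm, which is why it is conventionally counted among the p-norms even though it is not produced by a finite \(p\).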
Measuring the "size" of a matrix is more complex. Common matrix norms include:
Frobenius Norm: \(\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}\), analogous to the vector L₂ norm, treating the matrix as a long vector of its elements.
Induced Norms (Operator Norms): Defined based on how the matrix transforms vectors, measuring the maximum "stretching factor" applied to vectors according to a specific vector norm (e.g., induced L₁, L₂, L∞ norms). The induced L₂ norm is also called the spectral norm and equals the largest singular value.
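Both matrix norms above are available through `np.linalg.norm`; a short sketch on an arbitrary 2×2 matrix verifying the two characterizations just stated:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

fro = np.linalg.norm(A, ord='fro')  # Frobenius norm
spec = np.linalg.norm(A, ord=2)     # induced L2 (spectral) norm

# Frobenius norm = L2 norm of the matrix flattened into a vector
assert np.isclose(fro, np.linalg.norm(A.ravel()))

# Spectral norm = largest singular value of A
assert np.isclose(spec, np.linalg.svd(A, compute_uv=False)[0])

# Spectral norm bounds the stretching factor: ||Ax|| <= ||A||_2 * ||x||
x = np.random.default_rng(1).standard_normal(2)
assert np.linalg.norm(A @ x) <= spec * np.linalg.norm(x) + 1e-12
```

The final assertion illustrates the defining property of an induced norm: it is the smallest constant \(c\) with \(\|A\mathbf{x}\| \le c\,\|\mathbf{x}\|\) for all \(\mathbf{x}\), which also explains why \(\|A\|_2 \le \|A\|_F\) always holds.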