A norm on a vector space \(V\) is a function \(\| \cdot \| : V \to \mathbb{R}\) that assigns a non-negative real-valued length or size to each vector \(\mathbf{x} \in V\), satisfying the following properties for all vectors \(\mathbf{x}, \mathbf{y} \in V\) and all scalars \(\alpha \in \mathbb{R}\) (or \(\mathbb{C}\)):
Non-negativity: \(\|\mathbf{x}\| \ge 0\).
Definiteness: \(\|\mathbf{x}\| = 0\) if and only if \(\mathbf{x} = \mathbf{0}\) (the zero vector).
Absolute homogeneity: \(\|\alpha \mathbf{x}\| = |\alpha| \, \|\mathbf{x}\|\) (scaling a vector scales its length by the absolute value of the scalar).
Triangle Inequality: \(\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|\). (The length of a side of a triangle is less than or equal to the sum of the lengths of the other two sides.)
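These axioms can be checked numerically. A minimal sketch using NumPy's Euclidean (L₂) norm on random vectors (the vectors and the scalar here are illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
y = rng.standard_normal(4)
alpha = -2.5

norm = np.linalg.norm  # Euclidean (L2) norm by default

# Non-negativity and definiteness
assert norm(x) >= 0
assert norm(np.zeros(4)) == 0

# Absolute homogeneity: ||alpha * x|| = |alpha| * ||x||
assert np.isclose(norm(alpha * x), abs(alpha) * norm(x))

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
assert norm(x + y) <= norm(x) + norm(y)
```

Any candidate "length" function that passes these checks for all inputs is a norm; failing any one of them (e.g. a squared length fails homogeneity) disqualifies it.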
Norms serve several purposes:
Measure Size/Length: Quantify the magnitude of vectors.
Define Distance: The distance between two vectors \(\mathbf{x}\) and \(\mathbf{y}\) can be defined as the norm of their difference: \(d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|\).
Measure Matrix Magnitude: Quantify the "size" of a matrix, often related to its amplification effect on vectors.
Regularization in ML: Specific norms (like L₁ and L₂) are used in regularization techniques to penalize large parameter values (weights) in models, helping to prevent overfitting.
Error Measurement: Norms are used to measure the difference between predicted and actual values (e.g., the L₂ norm of the error vector in Mean Squared Error).
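The last two uses can be sketched together. A short NumPy example (the vectors and weights are made up for illustration) computing a norm-based distance, its relation to MSE, and the L₁/L₂ penalties used in regularization:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])

# Distance between vectors: the norm of their difference
err = y_pred - y_true
l2_error = np.linalg.norm(err)   # Euclidean distance d(y_pred, y_true)
mse = np.mean(err ** 2)          # MSE is the squared L2 error divided by n

# L1 and L2 penalties on a weight vector, as used in regularization
w = np.array([0.5, -1.5, 0.0, 2.0])
l1_penalty = np.linalg.norm(w, ord=1)  # sum of |w_i|
l2_penalty = np.linalg.norm(w, ord=2)  # sqrt of sum of w_i^2
```

Note how the L₁ penalty treats the zero weight as free, which is why L₁ regularization (lasso) tends to produce sparse weights, while L₂ (ridge) shrinks all weights smoothly.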
The most common vector norms belong to the family of \(L_p\) norms (or p-norms), defined for \(p \ge 1\):
$$ \|\mathbf{x}\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p} $$
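The definition above can be implemented directly and checked against NumPy's built-in `np.linalg.norm` (the helper name `p_norm` is our own):

```python
import numpy as np

def p_norm(x, p):
    """Compute the Lp norm directly from the definition."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])

assert np.isclose(p_norm(x, 1), 7.0)  # L1: |3| + |-4|
assert np.isclose(p_norm(x, 2), 5.0)  # L2: sqrt(9 + 16)

# The direct formula matches NumPy's implementation
for p in (1, 2, 3):
    assert np.isclose(p_norm(x, p), np.linalg.norm(x, ord=p))

# As p -> infinity, the Lp norm approaches max |x_i|
assert np.isclose(np.linalg.norm(x, ord=np.inf), 4.0)
```

The limiting case \(p \to \infty\) gives the \(L_\infty\) (maximum) norm, which is why it is conventionally counted among the p-norms even though it is not produced by a finite \(p\).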
Measuring the "size" of a matrix is more complex. Common matrix norms include:
Frobenius Norm: \(\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}\), analogous to the vector L₂ norm, treating the matrix as a long vector of its elements.
Induced Norms (Operator Norms): Defined based on how the matrix transforms vectors, measuring the maximum "stretching factor" applied to vectors according to a specific vector norm (e.g., induced L₁, L₂, L∞ norms). The induced L₂ norm is also called the spectral norm and equals the largest singular value.
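Both matrix norms above are available through `np.linalg.norm`; a short sketch on an arbitrary 2×2 matrix verifying the two characterizations just stated:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

fro = np.linalg.norm(A, ord='fro')  # Frobenius norm
spec = np.linalg.norm(A, ord=2)     # induced L2 (spectral) norm

# Frobenius norm = L2 norm of the matrix flattened into a vector
assert np.isclose(fro, np.linalg.norm(A.ravel()))

# Spectral norm = largest singular value of A
assert np.isclose(spec, np.linalg.svd(A, compute_uv=False)[0])

# Spectral norm bounds the stretching factor: ||Ax|| <= ||A||_2 * ||x||
x = np.random.default_rng(1).standard_normal(2)
assert np.linalg.norm(A @ x) <= spec * np.linalg.norm(x) + 1e-12
```

The final assertion illustrates the defining property of an induced norm: it is the smallest constant \(c\) with \(\|A\mathbf{x}\| \le c\,\|\mathbf{x}\|\) for all \(\mathbf{x}\), which also explains why \(\|A\|_2 \le \|A\|_F\) always holds.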