A matrix is a rectangular array or grid of numbers (scalars), arranged in rows and columns.
Matrices are used to represent linear transformations (like rotation, scaling, shearing), systems of linear equations, datasets, and parameters in machine learning models.
They provide a compact way to organize and manipulate related data or coefficients.
Notation: Usually denoted by uppercase bold letters (e.g., \(\mathbf{A}, \mathbf{X}, \mathbf{W}\)).
Elements/Entries: The individual numbers (scalars) within the matrix. An entry is identified by its row index \(i\) and column index \(j\), often written as \(A_{ij}\), \(a_{ij}\), or \((\mathbf{A})_{ij}\).
Dimensions (Shape): Defined by the number of rows (\(m\)) and the number of columns (\(n\)). A matrix with \(m\) rows and \(n\) columns is called an "\(m\) by \(n\)" matrix (written \(m \times n\)). \(\mathbf{A} \in \mathbb{R}^{m \times n}\) is an \(m \times n\) real matrix.
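The basics above can be sketched in code (a minimal example assuming NumPy; the definitions themselves are library-agnostic):

```python
import numpy as np

# A 2 x 3 real matrix: m = 2 rows, n = 3 columns
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(A.shape)   # (2, 3)
print(A[0, 2])   # the entry in row 1, column 3 (A_{13} in 1-based math notation)
```

Note that NumPy uses 0-based indices, so the math-notation entry \(A_{ij}\) is `A[i-1, j-1]` in code.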
Square Matrix: Number of rows equals number of columns (\(m = n\)).
Identity Matrix (\(\mathbf{I}\)): A square matrix with 1s on the main diagonal (top-left to bottom-right) and 0s elsewhere. Acts like the number 1 in matrix multiplication.
Zero Matrix (\(\mathbf{0}\)): A matrix where all entries are 0.
Diagonal Matrix: A square matrix where all off-diagonal entries (\(i \neq j\)) are 0.
Symmetric Matrix: A square matrix where \(A_{ij} = A_{ji}\) for all \(i, j\) (equal to its transpose: \(\mathbf{A} = \mathbf{A}^T\)). Covariance matrices are symmetric.
Transpose (\(\mathbf{A}^T\)): Obtained by swapping rows and columns. The transpose of an \(m \times n\) matrix is an \(n \times m\) matrix where \((\mathbf{A}^T)_{ij} = A_{ji}\).
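The special matrices and the transpose can be checked numerically (a short sketch assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Identity acts like the number 1 in matrix multiplication: I A = A
I = np.eye(2)
assert np.allclose(I @ A, A)

# Transpose swaps rows and columns: (A^T)_{ij} = A_{ji}
At = A.T
assert At[0, 1] == A[1, 0]

# A symmetric matrix equals its own transpose
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
assert np.allclose(S, S.T)
```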
Dataset Representation: An entire dataset can be represented as a matrix where rows correspond to data points (samples) and columns correspond to features, giving an \(n_{\text{samples}} \times n_{\text{features}}\) matrix.
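For example (a toy dataset, invented here for illustration, using NumPy):

```python
import numpy as np

# Toy dataset: 4 samples (rows) x 3 features (columns)
X = np.array([[5.1, 3.5, 1.4],
              [4.9, 3.0, 1.4],
              [6.2, 3.4, 5.4],
              [5.9, 3.0, 5.1]])

n_samples, n_features = X.shape   # (4, 3)
feature_means = X.mean(axis=0)    # one mean per feature (per column)
```

Operating along `axis=0` aggregates over samples, which is why per-feature statistics come out as a length-\(n_{\text{features}}\) vector.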
Linear Transformations: Matrices are used to represent linear operations like rotation, scaling, and shearing in computer graphics or feature transformations. Multiplying a vector by a matrix applies the transformation to that vector.
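A concrete instance: the standard 2D rotation matrix applied to a vector (a minimal sketch assuming NumPy):

```python
import numpy as np

theta = np.pi / 2  # rotate 90 degrees counterclockwise
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])  # unit vector along the x-axis
rotated = R @ v           # the matrix-vector product applies the rotation
# rotated is approximately (0, 1): the x-axis is mapped onto the y-axis
```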
Systems of Linear Equations: The coefficients of a system \(\mathbf{Ax = b}\) are represented by matrix \(\mathbf{A}\).
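Such a system can be solved directly from its matrix form (a sketch with an invented example system, assuming NumPy):

```python
import numpy as np

# System: 2x + y = 5 and x + 3y = 10, written as A x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)     # solves A x = b for x
assert np.allclose(A @ x, b)  # the solution satisfies the original system
```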
Neural Network Weights: The weights connecting neurons between two layers are often stored in a matrix \(\mathbf{W}\), where \(W_{ij}\) might be the weight from neuron \(j\) in the previous layer to neuron \(i\) in the current layer.
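Under that indexing convention, a layer's pre-activations are a single matrix-vector product (a sketch with invented layer sizes, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 4, 3
W = rng.normal(size=(n_out, n_in))  # W[i, j]: weight from input neuron j to output neuron i
b = np.zeros(n_out)                 # bias vector

x = rng.normal(size=n_in)           # activations of the previous layer
z = W @ x + b                       # pre-activations of the current layer
assert z.shape == (n_out,)
```

Row \(i\) of \(\mathbf{W}\) collects all the weights feeding neuron \(i\), so \(\mathbf{Wx}\) computes every neuron's weighted sum at once.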
Covariance Matrix: A square matrix \(\mathbf{\Sigma}\) representing the pairwise covariances between different features in a dataset. Used in PCA, etc. \(\Sigma_{ij} = \mathrm{Cov}(X_i, X_j)\).
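Computing one from a samples-by-features matrix (a sketch with random data, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))       # 100 samples, 3 features

# np.cov treats rows as variables by default, so pass the transpose
Sigma = np.cov(X.T)
assert Sigma.shape == (3, 3)        # one row/column per feature
assert np.allclose(Sigma, Sigma.T)  # covariance matrices are symmetric
```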
Adjacency Matrix: Represents connections in a graph, where \(A_{ij} = 1\) if node \(i\) is connected to node \(j\), and 0 otherwise.
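A small example (an invented 4-node undirected graph, assuming NumPy):

```python
import numpy as np

# Undirected path graph on 4 nodes with edges (0,1), (1,2), (2,3)
A = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = 1
    A[j, i] = 1  # undirected edges make the matrix symmetric

degrees = A.sum(axis=1)  # row sums give each node's degree
# (A @ A)[i, j] counts walks of length 2 from node i to node j
```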