A matrix is a rectangular array or grid of numbers (scalars), arranged in rows and columns.
Matrices are used to represent linear transformations (like rotation, scaling, shearing), systems of linear equations, datasets, and parameters in machine learning models.
They provide a compact way to organize and manipulate related data or coefficients.
Notation: Usually denoted by uppercase bold letters (e.g., \(\mathbf{A}, \mathbf{X}, \mathbf{W}\)).
Elements/Entries: The individual numbers (scalars) within the matrix. An entry is identified by its row index \(i\) and column index \(j\), often written as \(A_{ij}\), \(a_{ij}\), or \((\mathbf{A})_{ij}\).
Dimensions (Shape): Defined by the number of rows (\(m\)) and the number of columns (\(n\)). A matrix with \(m\) rows and \(n\) columns is called an "\(m\) by \(n\)" matrix (written \(m \times n\)). \(\mathbf{A} \in \mathbb{R}^{m \times n}\) is an \(m \times n\) real matrix.
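The basics above can be sketched in code (a minimal example assuming NumPy; the definitions themselves are library-agnostic):

```python
import numpy as np

# A 2 x 3 real matrix: m = 2 rows, n = 3 columns
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(A.shape)   # (2, 3)
print(A[0, 2])   # the entry in row 1, column 3 (A_{13} in 1-based math notation)
```

Note that NumPy uses 0-based indices, so the math-notation entry \(A_{ij}\) is `A[i-1, j-1]` in code.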
Square Matrix: Number of rows equals number of columns (\(m = n\)).
Identity Matrix (\(\mathbf{I}\)): A square matrix with 1s on the main diagonal (top-left to bottom-right) and 0s elsewhere. Acts like the number 1 in matrix multiplication.
Zero Matrix (\(\mathbf{0}\)): A matrix where all entries are 0.
Diagonal Matrix: A square matrix where all off-diagonal entries (\(i \neq j\)) are 0.
Symmetric Matrix: A square matrix where \(A_{ij} = A_{ji}\) for all \(i, j\) (equal to its transpose: \(\mathbf{A} = \mathbf{A}^T\)). Covariance matrices are symmetric.
Transpose (\(\mathbf{A}^T\)): Obtained by swapping rows and columns. The transpose of an \(m \times n\) matrix is an \(n \times m\) matrix where \((\mathbf{A}^T)_{ij} = A_{ji}\).
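The special matrices and the transpose can be checked numerically (a short sketch assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Identity acts like the number 1 in matrix multiplication: I A = A
I = np.eye(2)
assert np.allclose(I @ A, A)

# Transpose swaps rows and columns: (A^T)_{ij} = A_{ji}
At = A.T
assert At[0, 1] == A[1, 0]

# A symmetric matrix equals its own transpose
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
assert np.allclose(S, S.T)
```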
Dataset Representation: An entire dataset can be represented as a matrix where rows correspond to data points (samples) and columns correspond to features, giving an \(n_{\text{samples}} \times n_{\text{features}}\) matrix.
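For example (a toy dataset, invented here for illustration, using NumPy):

```python
import numpy as np

# Toy dataset: 4 samples (rows) x 3 features (columns)
X = np.array([[5.1, 3.5, 1.4],
              [4.9, 3.0, 1.4],
              [6.2, 3.4, 5.4],
              [5.9, 3.0, 5.1]])

n_samples, n_features = X.shape   # (4, 3)
feature_means = X.mean(axis=0)    # one mean per feature (per column)
```

Operating along `axis=0` aggregates over samples, which is why per-feature statistics come out as a length-\(n_{\text{features}}\) vector.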
Linear Transformations: Matrices are used to represent linear operations like rotation, scaling, and shearing in computer graphics or feature transformations. Multiplying a vector by a matrix applies the transformation to that vector.
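A concrete instance: the standard 2D rotation matrix applied to a vector (a minimal sketch assuming NumPy):

```python
import numpy as np

theta = np.pi / 2  # rotate 90 degrees counterclockwise
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])  # unit vector along the x-axis
rotated = R @ v           # the matrix-vector product applies the rotation
# rotated is approximately (0, 1): the x-axis is mapped onto the y-axis
```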
Systems of Linear Equations: The coefficients of a system \(\mathbf{Ax = b}\) are represented by matrix \(\mathbf{A}\).
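Such a system can be solved directly from its matrix form (a sketch with an invented example system, assuming NumPy):

```python
import numpy as np

# System: 2x + y = 5 and x + 3y = 10, written as A x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)     # solves A x = b for x
assert np.allclose(A @ x, b)  # the solution satisfies the original system
```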
Neural Network Weights: The weights connecting neurons between two layers are often stored in a matrix \(\mathbf{W}\), where \(W_{ij}\) might be the weight from neuron \(j\) in the previous layer to neuron \(i\) in the current layer.
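Under that indexing convention, a layer's pre-activations are a single matrix-vector product (a sketch with invented layer sizes, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 4, 3
W = rng.normal(size=(n_out, n_in))  # W[i, j]: weight from input neuron j to output neuron i
b = np.zeros(n_out)                 # bias vector

x = rng.normal(size=n_in)           # activations of the previous layer
z = W @ x + b                       # pre-activations of the current layer
assert z.shape == (n_out,)
```

Row \(i\) of \(\mathbf{W}\) collects all the weights feeding neuron \(i\), so \(\mathbf{Wx}\) computes every neuron's weighted sum at once.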
Covariance Matrix: A square matrix \(\mathbf{\Sigma}\) representing the pairwise covariances between different features in a dataset. Used in PCA, etc. \(\Sigma_{ij} = \mathrm{Cov}(X_i, X_j)\).
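Computing one from a samples-by-features matrix (a sketch with random data, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))       # 100 samples, 3 features

# np.cov treats rows as variables by default, so pass the transpose
Sigma = np.cov(X.T)
assert Sigma.shape == (3, 3)        # one row/column per feature
assert np.allclose(Sigma, Sigma.T)  # covariance matrices are symmetric
```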
Adjacency Matrix: Represents connections in a graph, where \(A_{ij} = 1\) if node \(i\) is connected to node \(j\), and 0 otherwise.
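A small example (an invented 4-node undirected graph, assuming NumPy):

```python
import numpy as np

# Undirected path graph on 4 nodes with edges (0,1), (1,2), (2,3)
A = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = 1
    A[j, i] = 1  # undirected edges make the matrix symmetric

degrees = A.sum(axis=1)  # row sums give each node's degree
# (A @ A)[i, j] counts walks of length 2 from node i to node j
```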