Matrices

Definition / Introduction

  • A matrix is a rectangular array or grid of numbers (scalars), arranged in rows and columns.
  • Matrices are used to represent linear transformations (like rotation, scaling, shearing), systems of linear equations, datasets, and parameters in machine learning models.
  • They provide a compact way to organize and manipulate related data or coefficients.

Key Concepts

1. Representation

  • Notation: Usually denoted by uppercase bold letters (e.g., \(\mathbf{A}, \mathbf{X}, \mathbf{W}\)).
  • Elements/Entries: The individual numbers (scalars) within the matrix. An entry is identified by its row index \(i\) and column index \(j\), often written as \(A_{ij}\), \(a_{ij}\), or \((\mathbf{A})_{ij}\).
  • Dimensions (Shape): Defined by the number of rows (\(m\)) and the number of columns (\(n\)). A matrix with \(m\) rows and \(n\) columns is called an "\(m\) by \(n\)" matrix (written \(m \times n\)). \(\mathbf{A} \in \mathbb{R}^{m \times n}\) is an \(m \times n\) real matrix.
  • Example: A \(2 \times 3\) matrix \(\mathbf{A}\): $$ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} $$
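The notation above can be tried out in code. This is a minimal sketch using NumPy (an assumption; the notes don't name a library), showing the \(2 \times 3\) example matrix, its shape, and entry access:

```python
import numpy as np

# The 2x3 example matrix from above (2 rows, 3 columns).
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A.shape)   # (2, 3): m rows, n columns
print(A[0, 2])   # entry a_13 -> 3 (NumPy indices are 0-based, so A[0, 2])
```

Note that NumPy uses 0-based indexing, so the mathematical entry \(a_{ij}\) corresponds to `A[i-1, j-1]` in code.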

2. Rows and Columns

  • Each row of an \(m \times n\) matrix can be viewed as a row vector of dimension \(n\).
  • Each column of an \(m \times n\) matrix can be viewed as a column vector of dimension \(m\).
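The row/column view can be sketched with NumPy slicing (an assumption; any array library would work similarly):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])  # a 2x3 matrix

row0 = A[0, :]   # first row: a vector of dimension n = 3
col1 = A[:, 1]   # second column: a vector of dimension m = 2

print(row0)      # [1 2 3]
print(col1)      # [2 5]
```

NumPy returns both slices as 1-D arrays; whether they are treated as row or column vectors is up to the surrounding computation.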

3. Special Types of Matrices

  • Square Matrix: Number of rows equals number of columns (\(m = n\)).
  • Identity Matrix (\(\mathbf{I}\)): A square matrix with 1s on the main diagonal (top-left to bottom-right) and 0s elsewhere. Acts like the number 1 in matrix multiplication.
  • Zero Matrix (\(\mathbf{0}\)): A matrix where all entries are 0.
  • Diagonal Matrix: A square matrix where all off-diagonal entries (\(i \neq j\)) are 0.
  • Symmetric Matrix: A square matrix where \(A_{ij} = A_{ji}\) for all \(i, j\) (equal to its transpose: \(\mathbf{A} = \mathbf{A}^T\)). Covariance matrices are symmetric.
  • Transpose (\(\mathbf{A}^T\)): Obtained by swapping rows and columns. The transpose of an \(m \times n\) matrix is an \(n \times m\) matrix where \((\mathbf{A}^T)_{ij} = A_{ji}\).
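The special matrices above have direct NumPy constructors; this sketch (assuming NumPy) builds each type and verifies the symmetry and transpose properties:

```python
import numpy as np

I = np.eye(3)             # 3x3 identity matrix: 1s on the diagonal, 0s elsewhere
Z = np.zeros((2, 3))      # 2x3 zero matrix
D = np.diag([1, 2, 3])    # diagonal matrix with the given diagonal entries

# A symmetric matrix equals its own transpose.
S = np.array([[2, 1],
              [1, 3]])
print(np.array_equal(S, S.T))  # True

# Transposing an m x n matrix gives an n x m matrix.
A = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
print(A.T.shape)               # (3, 2)

# The identity acts like 1 in matrix multiplication: I @ x = x.
print(np.array_equal(np.eye(2) @ S, S))  # True
```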

4. Examples in Data Science / AI

  • Dataset Representation: An entire dataset can be represented as an \(n_{\text{samples}} \times n_{\text{features}}\) matrix, where rows correspond to data points (samples) and columns correspond to features.
  • Linear Transformations: Matrices are used to represent linear operations like rotation, scaling, and shearing in computer graphics or feature transformations. Multiplying a vector by a matrix transforms the vector.
  • Systems of Linear Equations: The coefficients of a system \(\mathbf{Ax = b}\) are represented by matrix \(\mathbf{A}\).
  • Neural Network Weights: The weights connecting neurons between two layers are often stored in a matrix \(\mathbf{W}\), where \(W_{ij}\) might be the weight from neuron \(j\) in the previous layer to neuron \(i\) in the current layer.
  • Covariance Matrix: A square, symmetric matrix \(\mathbf{\Sigma}\) whose entries are the pairwise covariances between features: \(\Sigma_{ij} = \operatorname{Cov}(X_i, X_j)\). Used in PCA and related methods.
  • Adjacency Matrix: Represents the connections in a graph; for an unweighted graph, \(A_{ij} = 1\) if node \(i\) is connected to node \(j\), and 0 otherwise.
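Two of these uses can be illustrated together. This is a small sketch (assuming NumPy; the toy data is invented for illustration) that stores a dataset as a samples-by-features matrix and computes its covariance matrix, confirming the symmetry claimed above:

```python
import numpy as np

# Toy dataset: 4 samples x 2 features (n_samples x n_features), invented values.
X = np.array([[1.0, 2.0],
              [2.0, 4.1],
              [3.0, 6.2],
              [4.0, 7.9]])

# Covariance matrix of the features; rowvar=False treats columns as variables.
Sigma = np.cov(X, rowvar=False)

print(Sigma.shape)                  # (2, 2): one row/column per feature
print(np.allclose(Sigma, Sigma.T))  # True: covariance matrices are symmetric
```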

Connections to Other Topics

Summary

  • A matrix (\(\mathbf{A}\)) is a rectangular array of numbers (scalars) arranged in rows and columns. Dimensions are \(m \times n\).
  • Used to represent datasets, linear transformations, systems of equations, model parameters (NN weights), and covariance structures.
  • Fundamental object for manipulating blocks of data and representing linear relationships.