Finding maxima (peaks) and minima (valleys) of functions is a core task in optimization. Calculus gives us tools to find these points.
Single Variable: Think about finding the highest or lowest point on a curve \(y=f(x)\). At these points, the slope (derivative) is usually zero (the tangent line is horizontal).
Multiple Variables: Think about finding peaks or valley bottoms on a surface \(z=f(x, y)\). At these points, the "ground" is flat in all directions, meaning all partial derivatives (and thus the gradient) must be zero.
Local Maximum: A point \(c\) where \(f(c) \ge f(x)\) for all \(x\) in some open interval around\(c\). It's a peak in its immediate neighborhood.
Local Minimum: A point \(c\) where \(f(c) \le f(x)\) for all \(x\) in some open interval around\(c\). It's a valley bottom in its immediate neighborhood.
Global Maximum: A point \(c\) where \(f(c) \ge f(x)\) for all\(x\) in the function's entire domain. The absolute highest point.
Global Minimum: A point \(c\) where \(f(c) \le f(x)\) for all\(x\) in the function's entire domain. The absolute lowest point.
Single Variable: A point \(c\) in the domain of \(f\) is a critical point if either \(f'(c) = 0\) or \(f'(c)\) does not exist.
Multiple Variables: A point \(\mathbf{a}\) in the domain of \(f\) is a critical point if either \(\nabla f(\mathbf{a}) = \mathbf{0}\) (the zero vector, meaning all partial derivatives are zero) or if \(\nabla f(\mathbf{a})\) does not exist.
Fermat's Theorem (for local extrema): If \(f\) has a local maximum or minimum at an interior point \(c\) (or \(\mathbf{a}\)) and the derivative (or gradient) exists there, then \(f'(c) = 0\) (or \(\nabla f(\mathbf{a}) = \mathbf{0}\)).
Important: The condition \(f'(c)=0\) or \(\nabla f(\mathbf{a})=\mathbf{0}\) is necessary for a local extremum (if the derivative/gradient exists), but it is not sufficient. Points where the derivative/gradient is zero could also be saddle points.
If \(f''(c) > 0\): \(f\) is concave up, so \(c\) is a local minimum.
If \(f''(c) < 0\): \(f\) is concave down, so \(c\) is a local maximum.
If \(f''(c) = 0\): The test is inconclusive.
3. Second Derivative Test (Multiple Variables - using Hessian)¶
Find critical points where \(\nabla f(\mathbf{a}) = \mathbf{0}\).
Compute the Hessian matrix\(\mathbf{H}(\mathbf{a})\), the matrix of second partial derivatives evaluated at \(\mathbf{a}\).
Calculate the determinant of the Hessian, \(D = \det(\mathbf{H}(\mathbf{a}))\). Let \(f_{xx} = \partial^2 f / \partial x^2\) evaluated at \(\mathbf{a}\).
If \(D > 0\) and \(f_{xx}(\mathbf{a}) > 0\): Local minimum. (Hessian is positive definite).
If \(D > 0\) and \(f_{xx}(\mathbf{a}) < 0\): Local maximum. (Hessian is negative definite).
If \(D < 0\): Saddle point. (Hessian is indefinite).
If \(D = 0\): The test is inconclusive.
4. Finding Global Extrema (on a closed interval/region)¶
Find all critical points inside the interval/region.
Evaluate the function \(f\) at these critical points.
Evaluate the function \(f\) at the boundaries of the interval/region.
The largest value found is the global maximum; the smallest is the global minimum.
Gradient Descent & Optimization Algorithms: These algorithms aim to find minima (usually local minima) of loss functions by iteratively moving towards points where the gradient is zero.
Model Training: Finding the optimal parameters for a machine learning model involves minimizing a loss function, i.e., finding its minimum value.
Maximum Likelihood Estimation (MLE): A statistical method that finds parameter values maximizing the likelihood function (finding a maximum).
Convex Functions: For convex functions, any local minimum is also a global minimum, which significantly simplifies optimization.
Extrema are maximum or minimum values of a function.
Local Extrema occur at critical points where the derivative\(f'(x)=0\) (single var) or the gradient\(\nabla f(\mathbf{a})=\mathbf{0}\) (multi var), or where they don't exist.
First/Second Derivative Tests (single var) or the Second Derivative Test using the Hessian Matrix (multi var) help classify critical points as local maxima, minima, or saddle points.