Derivative Testing and Special Points

Derivatives are useful for analyzing the behaviour of functions and finding where the important points are. Let’s see how these points are classified and how to find them with derivatives.

We’ll only be covering real functions, up to the multivariate, vector-valued case: $$ f: \mathbb{R}^n \rightarrow \mathbb{R}^m $$

Types of Points

The broadest class of these is the critical points: points where $f'$ is either zero or undefined.

Points where $f'$ is zero are stationary points. All stationary points are critical points, and every critical point where $f$ is differentiable is a stationary point.

If the sign of $f'$ changes at a critical point, we call it a turning point. If it doesn’t change at a stationary point, it’s an inflexion point (specifically a stationary inflexion point). All stationary points are either turning points or inflexion points, but not necessarily the other way around: we also have non-stationary inflexion points, where $f' \neq 0$.

Inflexion points are defined by the function changing curvature at that point: convexity and concavity swap, which we see when $f''$ changes sign.

For inflexion points, if $f'$ is positive on both sides, it’s a rising inflexion point, and if negative on both sides, a falling inflexion point. We say that non-stationary inflexion points are strictly rising or falling.
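
For a concrete picture: $f(x) = x^2$ has $f'(x) = 2x$, which flips from negative to positive at $x = 0$, so that’s a turning point. $f(x) = x^3$ has $f'(x) = 3x^2$, which is zero at $x = 0$ but positive on both sides, so that’s a stationary rising inflexion point ($f'' = 6x$ changes sign there). And $f(x) = x^3 + x$ has $f'(0) = 1 \neq 0$ while $f'' = 6x$ still changes sign at $x = 0$, making it a non-stationary (strictly rising) inflexion point.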

Points of the greatest or least value over a range are extreme points (extrema). If that range is the whole domain, the point is a global (absolute) extremum; if it only covers a neighbourhood of the point, it’s a local (relative) one.

All turning points are local extreme points, but not the other way around: extreme points at the boundary of the domain generally won’t be turning points.

Points that are stationary but not local extrema are saddle points.

Finding the Points

So we’ve defined the points we generally care about; now let’s find them. I’m going to cover four different cases:

|               | Univariate                              | Multivariate                            |
|---------------|-----------------------------------------|-----------------------------------------|
| Scalar-Valued | $\mathbb{R}^1 \rightarrow \mathbb{R}^1$ | $\mathbb{R}^n \rightarrow \mathbb{R}^1$ |
| Vector-Valued | $\mathbb{R}^1 \rightarrow \mathbb{R}^m$ | $\mathbb{R}^n \rightarrow \mathbb{R}^m$ |

Univariate Scalar-Valued Functions

This is the classic simplest case. We want to find critical points of a function $f: \mathbb{R} \rightarrow \mathbb{R}$.

Stationary points are found as solutions of $f'=0$. The first derivative test says that if $f'$ changes from positive to negative the point is a maximum, and from negative to positive a minimum. For a critical point $c$ we can check a sufficiently small neighbourhood (for some small $\epsilon > 0$): $$ \text{sgn}(f'(c-\epsilon))=- \text{sgn}(f'(c+\epsilon))$$
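
As a minimal sketch of this test in code, here’s a SymPy version applied to $f(x) = x^3 - 3x$, a function chosen purely for illustration; the fixed offset plays the role of “sufficiently small”, which is fine for this polynomial but not a rigorous neighbourhood check:

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = x**3 - 3*x                      # illustrative example function
fp = sp.diff(f, x)                  # f' = 3x^2 - 3

eps = sp.Rational(1, 100)           # pragmatic stand-in for "sufficiently small"
for c in sp.solve(fp, x):           # stationary points: x = -1, 1
    left = sp.sign(fp.subs(x, c - eps))
    right = sp.sign(fp.subs(x, c + eps))
    if left > 0 and right < 0:
        print(c, "local maximum")   # f' goes + to -
    elif left < 0 and right > 0:
        print(c, "local minimum")   # f' goes - to +
    else:
        print(c, "no sign change (not a turning point)")
```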

Alternatively, we can use the second derivative test, evaluating $f''$ at the stationary point. If $f'' > 0$ it’s a minimum (the function is concave up there), if $f'' < 0$ it’s a maximum (concave down), and if $f'' = 0$ the test is inconclusive.
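
The same stationary points can be classified with the second derivative test; again the example function is an arbitrary choice:

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = x**3 - 3*x                          # same illustrative example as above
fpp = sp.diff(f, x, 2)                  # f'' = 6x

for c in sp.solve(sp.diff(f, x), x):    # stationary points: x = -1, 1
    curvature = fpp.subs(x, c)
    if curvature > 0:
        print(c, "local minimum")       # concave up
    elif curvature < 0:
        print(c, "local maximum")       # concave down
    else:
        print(c, "inconclusive")        # fall back to the higher-order test
```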

When the second derivative test is inconclusive, we use the higher-order derivative test. This works for sufficiently differentiable functions that have some non-zero higher-order derivative at the point. Find the smallest order $n \geq 2$ for which $f^{(n)}$ is non-zero at the point, then classify it according to this chart:

|             | $f^{(n)} > 0$    | $f^{(n)} < 0$     |
|-------------|------------------|-------------------|
| $n$ is even | Local minimum    | Local maximum     |
| $n$ is odd  | Rising inflexion | Falling inflexion |
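
A sketch of this procedure; $f(x) = x^4$ at $c = 0$ is a hypothetical example where the second derivative test fails, and the loop assumes (as the test itself does) that some derivative is eventually non-zero at the point, otherwise it would never terminate:

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = x**4                            # hypothetical example: f'(0) = f''(0) = f'''(0) = 0
c = 0

# Find the smallest n >= 2 with f^(n)(c) != 0 (assumes such an n exists)
n = 2
d = sp.diff(f, x, n)
while d.subs(x, c) == 0:
    n += 1
    d = sp.diff(f, x, n)

value = d.subs(x, c)                # here n = 4 and f''''(0) = 24
if n % 2 == 0:
    print("local minimum" if value > 0 else "local maximum")
else:
    print("rising inflexion" if value > 0 else "falling inflexion")
```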

Multivariate Scalar-Valued Functions

Now we find the critical points of $f: \mathbb{R}^n \rightarrow \mathbb{R}$. The analogue of the derivative in the multivariate case is the vector of partial derivatives called the gradient: $$\nabla f = \begin{bmatrix} f_{x_1} & \cdots & f_{x_n} \end{bmatrix}^T$$

As expected, the gradient is zero at stationary points, so we solve $\nabla f = \vec{0}$. Our analogue of the second derivative test uses the Hessian matrix of second partial derivatives:

$$ \mathbf{H}(f)_{ij} = \partial^2_{x_i x_j} f $$

At each stationary point we evaluate the Hessian, find its eigenvalues, and classify the point this way:

| Eigenvalues                 | Critical Point |
|-----------------------------|----------------|
| All positive                | Local minimum  |
| All negative                | Local maximum  |
| Mixed positive and negative | Saddle point   |
| Some zero, no mixed signs   | Inconclusive   |
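
Putting the pieces together as a sketch in SymPy; the surface $f(x, y) = x^3 - 3x + y^2$ is an arbitrary example chosen to produce both a minimum and a saddle:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - 3*x + y**2                       # illustrative example surface

grad = [sp.diff(f, v) for v in (x, y)]      # gradient components
H = sp.hessian(f, (x, y))                   # symbolic Hessian matrix

# Stationary points solve grad f = 0; classify each by the Hessian's eigenvalues there
for point in sp.solve(grad, (x, y), dict=True):
    eigs = list(H.subs(point).eigenvals())
    if all(e > 0 for e in eigs):
        kind = "local minimum"
    elif all(e < 0 for e in eigs):
        kind = "local maximum"
    elif any(e > 0 for e in eigs) and any(e < 0 for e in eigs):
        kind = "saddle point"
    else:
        kind = "inconclusive (a zero eigenvalue)"
    print(point, eigs, kind)                # (1, 0): minimum, (-1, 0): saddle
```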

Vector-Valued Functions

For vector-valued functions ($f: \mathbb{R}^n \rightarrow \mathbb{R}^m$), we need a scalar measure ($g: \mathbb{R}^m \rightarrow \mathbb{R} $) to compare the outputs. Vectors aren’t naturally ordered the way scalars are, so without it we can’t say which vector is greater or lesser. The most common choice is the magnitude ($ g = \left | \mathbf{f} \right | $). Once we have that, we apply the methods above to the scalar composition $g \circ f$.
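
As a sketch, take the hypothetical curve $\mathbf{f}(t) = (t, t^2 - 1)$ and ask where it is locally nearest to or farthest from the origin. Optimizing the squared magnitude (same optima as the magnitude itself, but no square root) turns this back into the univariate scalar case:

```python
import sympy as sp

t = sp.symbols('t', real=True)
f = sp.Matrix([t, t**2 - 1])        # hypothetical curve: the parabola y = x^2 - 1

g = (f.T * f)[0]                    # scalarize: g(t) = |f(t)|^2 = t^4 - t^2 + 1
gp = sp.diff(g, t)
gpp = sp.diff(g, t, 2)

for c in sp.solve(gp, t):           # stationary points of g: 0, ±sqrt(2)/2
    kind = "local min of |f|" if gpp.subs(t, c) > 0 else "local max of |f|"
    print(c, kind)
```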

Though we can’t find extreme points directly on the vector-valued function, we can still find stationary points. For univariates ($f: \mathbb{R} \rightarrow \mathbb{R}^m $) we can take the derivative:

$$ f'(t) = \left( f_1'(t), \ldots, f_m'(t) \right) $$

and find stationary points where the derivatives are all zero.
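
A minimal sketch, using components picked (as an assumption for illustration) so that their derivatives share a common zero at $t = 1$:

```python
import sympy as sp

t = sp.symbols('t', real=True)
f = sp.Matrix([t**2 - 2*t, (t - 1)**3])          # illustrative components

fp = f.diff(t)                                   # componentwise derivative f'(t)

# A stationary point needs every component of f' to vanish at the same t,
# so intersect the solution sets of the individual components.
solution_sets = [sp.solveset(comp, t, domain=sp.S.Reals) for comp in fp]
stationary = solution_sets[0]
for s in solution_sets[1:]:
    stationary = stationary.intersect(s)
print(stationary)                                # {1}
```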

For the multivariate case ($ f: \mathbb{R}^n \rightarrow \mathbb{R}^m $) we can analogously apply the Jacobian matrix, which collects every partial derivative of every component of the vector-valued function:

$$ \mathbf{J}(f)_{ij} = \partial_{x_j}f_i, \qquad \mathbf{J}(f) = \begin{bmatrix} \partial_{x_1}\mathbf{f} & \cdots & \partial_{x_n}\mathbf{f} \end{bmatrix} $$

Stationary (critical) points are wherever the rank of the Jacobian is not maximal, i.e. where it drops below $\min(n, m)$.
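
A sketch of the rank check, using the arbitrarily chosen map $f(x, y) = (x^2 - y^2,\ 2xy)$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = sp.Matrix([x**2 - y**2, 2*x*y])              # illustrative map R^2 -> R^2

J = f.jacobian([x, y])                           # [[2x, -2y], [2y, 2x]]
max_rank = min(J.shape)                          # maximal possible rank: min(m, n) = 2

print(J.rank() == max_rank)                      # True: full rank at a generic point
print(J.subs({x: 0, y: 0}).rank() == max_rank)   # False: the origin is a critical point
```

For a square Jacobian like this one, the rank drops exactly where the determinant vanishes, which gives an equation to solve for the critical points directly.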