$$ \nabla (f \circ g) (v) = \begin{bmatrix} \nabla g_1 (v) & \dots & \nabla g_m (v)\end{bmatrix} \nabla f(g(v)) $$
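A quick numerical sanity check of the chain rule above can help when fixing the shape convention (the matrix of gradients is $n \times m$). The maps `g` and `f` below are hypothetical examples chosen for illustration, and the finite-difference helper is only a sketch, not a robust differentiator.

```python
import numpy as np

def g(v):
    # Hypothetical inner map g: R^2 -> R^2
    return np.array([v[0] * v[1], v[0] + v[1] ** 2])

def grad_g_columns(v):
    # Matrix whose columns are grad g_1(v) and grad g_2(v); shape (n, m) = (2, 2)
    return np.array([[v[1], 1.0],
                     [v[0], 2.0 * v[1]]])

def f(u):
    # Hypothetical outer function f: R^2 -> R
    return u[0] ** 2 + 3.0 * u[1]

def grad_f(u):
    return np.array([2.0 * u[0], 3.0])

def chain_rule_gradient(v):
    # [grad g_1(v) ... grad g_m(v)] @ grad f(g(v))
    return grad_g_columns(v) @ grad_f(g(v))

def numerical_gradient(h, v, eps=1e-6):
    # Central finite differences, one coordinate at a time
    grad = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = eps
        grad[i] = (h(v + step) - h(v - step)) / (2 * eps)
    return grad

v = np.array([1.5, -0.5])
analytic = chain_rule_gradient(v)
numeric = numerical_gradient(lambda x: f(g(x)), v)
print(analytic, numeric)
```

The two printed vectors should agree up to finite-difference error.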
$$ f(x) = f(x_0) + \nabla f(x_0 )^T (x-x_0) + \mathcal{E}(x) \approx f(x_0) + \nabla f(x_0)^T (x-x_0) $$
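To see what the error term $\mathcal{E}(x)$ looks like in practice, the sketch below compares a function with its first-order approximation along a fixed direction. The function `f`, the base point `x0`, and the direction `d` are all assumptions made for this example; for a twice-differentiable function the error is expected to shrink roughly quadratically with the step size.

```python
import numpy as np

def f(x):
    # Hypothetical smooth test function
    return np.exp(x[0]) + x[0] * x[1] ** 2

def grad_f(x):
    return np.array([np.exp(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]])

def linear_approx(x, x0):
    # f(x0) + grad f(x0)^T (x - x0)
    return f(x0) + grad_f(x0) @ (x - x0)

x0 = np.array([0.5, 1.0])
d = np.array([1.0, -1.0]) / np.sqrt(2.0)  # unit direction

errors = []
for t in (1e-1, 1e-2, 1e-3):
    x = x0 + t * d
    errors.append(abs(f(x) - linear_approx(x, x0)))

# Shrinking the step by 10 should shrink the error by roughly 100
print(errors)
```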
If $f$ is differentiable, its gradient at a local minimum $x^*$ in the interior of the domain is zero: $\nabla f(x^*) = 0$.
Rank theorems for $A \in F^{m \times n}$.
$$ 0 \leq \text{rank }A \leq \min(m,n ) \text{ and } \text{rank }A = 0 \iff A = 0 \\ \text{null } A = (\text{range }A^T) ^\perp \text{ so } \text{null } A \oplus \text{range }A^T = F^n \\ \dim \text{null } A + \text{rank }A = n $$
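The rank identities can be checked numerically on a small real matrix. This is a sketch under the assumption that a null space basis is read off the SVD (the trailing right singular vectors); the matrix `A` is an arbitrary rank-deficient example.

```python
import numpy as np

# Rank-1 example: the second row is twice the first
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
m, n = A.shape

rank = np.linalg.matrix_rank(A)

# Rows of Vt beyond the rank span null A
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[rank:].T          # shape (n, n - rank)
nullity = null_basis.shape[1]

# dim null A + rank A = n
print(rank, nullity, n)

# Every row of A (these span range A^T) is orthogonal to null A
print(np.allclose(A @ null_basis, 0.0))
```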
If a wide matrix $A$ has linearly independent rows, then $Ax = y$ is solvable for every $y$, and its solution set $\mathcal{S}$ is affine: it is the null space of $A$ translated by any particular solution $x_0$.
$$ \mathcal{S} = x_0 + \text{null }A $$
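As an illustration of $\mathcal{S} = x_0 + \text{null }A$, the sketch below takes a hypothetical $2 \times 3$ matrix with independent rows, a hand-picked particular solution `x0`, and a hand-picked null vector `z`, then confirms that every point `x0 + t * z` still solves the system.

```python
import numpy as np

# Wide matrix with linearly independent rows (example values)
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([2.0, 3.0])

x0 = np.array([2.0, 3.0, 0.0])   # one particular solution: A @ x0 == y
z = np.array([1.0, 1.0, -1.0])   # spans null A: A @ z == 0

def solution(t):
    # A point of the affine set x0 + null A
    return x0 + t * z

# Residual norm of A x - y for several points of the set
residuals = [np.linalg.norm(A @ solution(t) - y) for t in (-2.0, 0.0, 0.5, 10.0)]
print(residuals)
```

Here null A is one-dimensional, so a single parameter `t` sweeps out the whole solution set.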
Moore-Penrose Pseudoinverse (Full Row Rank). Obtains the solution of least norm to $Ax = y$.
$$ x^* = A^T (AA^T)^{-1}y $$
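A minimal sketch of the least-norm formula, assuming a small full-row-rank example matrix. It solves the linear system $AA^T w = y$ instead of forming the inverse explicitly, and compares the result against `np.linalg.pinv` and against another valid solution `x_other` chosen by hand.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])   # full row rank (example values)
y = np.array([2.0, 3.0])

# x* = A^T (A A^T)^{-1} y, computed via a linear solve
w = np.linalg.solve(A @ A.T, y)
x_star = A.T @ w

# Reference value from NumPy's pseudoinverse
x_pinv = np.linalg.pinv(A) @ y

# Another exact solution of A x = y, for comparison of norms
x_other = np.array([2.0, 3.0, 0.0])

print(x_star, np.linalg.norm(x_star), np.linalg.norm(x_other))
```

Both `x_star` and `x_other` satisfy $Ax = y$, but `x_star` has the smaller Euclidean norm.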
Moore-Penrose Pseudoinverse (Full Column Rank). Obtains the least squares solution to $Ax = y$.
$$ x^* = (A^T A)^{-1}A^T y $$
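More generally, the least squares solutions of $Ax = y$ are exactly the solutions of the normal equations; under full column rank this set is the single point $x^*$ above.
$$ \mathcal{S} = \{x^* \in \R^n \space | \space A^T Ax^* = A^T y\} $$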
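The normal equations can be tried out on a small overdetermined system. The data below (fitting a line through three points) is a made-up example; the sketch solves $A^TAx = A^Ty$ directly and cross-checks against `np.linalg.lstsq`, which is usually the numerically safer choice in practice.

```python
import numpy as np

# Tall matrix with full column rank: columns are [1, t] for t = 0, 1, 2
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.0])

# Solve the normal equations A^T A x = A^T y
x_star = np.linalg.solve(A.T @ A, A.T @ y)

# Reference value from NumPy's least squares routine
x_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]

# The residual is orthogonal to the columns of A
residual = A @ x_star - y
print(x_star, A.T @ residual)
```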
If $A$ has full column rank, then $A^TA$ is invertible. If $A$ has full row rank, then $AA^T$ is invertible.
A hyperplane $H$ is an $(n-1)$-dimensional affine set, typically embedded in an $n$-dimensional vector space $\mathbf{V}$, defined with respect to a normal vector $0 \neq a \in \mathbf{V}$ and a scalar offset $b \in \R$.