Miscellaneous Review

Any linear functional $f : V \rightarrow F$ on a finite-dimensional inner product space $V$ can be represented as $f(v) = \lang a \space | \space v\rang$ for some $a \in V$. If $V = \R^n$, then it follows that $f(v) = a^T v$.
- Proof
The gradient of $a^T v$ is $a$.
An affine function $f : V \rightarrow F$ is a function such that $g(v) = f(v) - f(0)$ is linear. From the above property, this allows us to represent any affine function in $\R^n$ as $f(v) = a^T v + f(0)$.
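
As a minimal sketch of this representation, the coefficients $a$ can be read off by evaluating $g(v) = f(v) - f(0)$ on the standard basis vectors. The function `f` and the helper `affine_coefficients` below are hypothetical, chosen only for illustration.

```python
def f(v):
    # Hypothetical affine function on R^3: f(v) = 2*v0 - v1 + 4*v2 + 5
    return 2 * v[0] - v[1] + 4 * v[2] + 5


def affine_coefficients(func, n):
    """Recover (a, b) with func(v) = a^T v + b, assuming func is affine on R^n."""
    b = func([0] * n)  # b = f(0)
    a = []
    for i in range(n):
        e_i = [1 if j == i else 0 for j in range(n)]
        a.append(func(e_i) - b)  # g(e_i) = f(e_i) - f(0) = a_i
    return a, b


a, b = affine_coefficients(f, 3)

# Check the representation f(v) = a^T v + f(0) at an arbitrary point.
v = [1, -2, 3]
reconstructed = sum(a_i * v_i for a_i, v_i in zip(a, v)) + b
```

The recovery only makes sense if `f` really is affine; for a general function the same procedure returns numbers that do not reproduce `f` away from the basis vectors.
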
The chain rule for gradients, for $g : \R^n \rightarrow \R^m$ with components $g_1, \dots, g_m$ and $f : \R^m \rightarrow \R$, is as follows,

$$ \nabla (f \circ g) (v) = \begin{bmatrix} \nabla g_1 (v) & \dots & \nabla g_m (v)\end{bmatrix} \nabla f(g(v)) $$
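
The formula can be sanity-checked numerically. The sketch below assumes a hypothetical pair $g : \R^2 \rightarrow \R^2$ and $f : \R^2 \rightarrow \R$ (neither comes from the notes) and compares the chain rule gradient against a central finite difference.

```python
def g(v):
    # Hypothetical inner map g : R^2 -> R^2
    return [v[0] * v[1], v[0] + v[1] ** 2]


def grad_g(v):
    # Gradients of g_1 and g_2; these are the columns of the matrix in the chain rule.
    return [[v[1], v[0]], [1.0, 2 * v[1]]]


def f(u):
    # Hypothetical outer function f : R^2 -> R
    return u[0] ** 2 + 3 * u[1]


def grad_f(u):
    return [2 * u[0], 3.0]


def chain_rule_gradient(v):
    # [grad g_1 ... grad g_m] @ grad f(g(v)), written out with plain lists.
    cols = grad_g(v)
    outer = grad_f(g(v))
    return [sum(cols[i][k] * outer[i] for i in range(len(cols)))
            for k in range(len(v))]


def numerical_gradient(h, v, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = []
    for k in range(len(v)):
        plus, minus = list(v), list(v)
        plus[k] += eps
        minus[k] -= eps
        grad.append((h(plus) - h(minus)) / (2 * eps))
    return grad


v = [1.5, -0.5]
analytic = chain_rule_gradient(v)
numeric = numerical_gradient(lambda x: f(g(x)), v)
```

The two results should agree up to finite-difference error, which is a quick way to catch a transposed Jacobian.
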

The first order Taylor approximation is an affine function that approximates a differentiable function $f$ near a point $x_0$. The closer $x$ is to $x_0$, the better the approximation, since the error $\mathcal{E}(x)$ vanishes faster than $\|x - x_0\|$ as $x \rightarrow x_0$.

$$ f(x) = f(x_0) + \nabla f(x_0 )^T (x-x_0) + \mathcal{E}(x) \approx f(x_0) + \nabla f(x_0)^T (x-x_0) $$
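
To see the error term shrink, the following sketch evaluates the approximation at points approaching $x_0$. The function `f`, the expansion point, and the direction are assumptions made for illustration; for a twice-differentiable function like this one the error behaves roughly quadratically in the step size.

```python
import math


def f(x):
    # Hypothetical smooth function on R^2
    return math.exp(x[0]) + x[0] * x[1] ** 2


def grad_f(x):
    return [math.exp(x[0]) + x[1] ** 2, 2 * x[0] * x[1]]


def taylor_1(x, x0):
    # f(x0) + grad f(x0)^T (x - x0)
    g = grad_f(x0)
    return f(x0) + sum(g_i * (x_i - x0_i) for g_i, x_i, x0_i in zip(g, x, x0))


x0 = [0.0, 1.0]
direction = [1.0, 1.0]
steps = [1e-1, 1e-2, 1e-3]

errors = []
for t in steps:
    x = [x0_i + t * d_i for x0_i, d_i in zip(x0, direction)]
    errors.append(abs(f(x) - taylor_1(x, x0)))  # |E(x)| at distance proportional to t
```
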

The gradient of a differentiable function at a local minimum in the interior of its domain is $0$.
- Proof
Rank theorems. For $A \in F^{m \times n}$:

$$ 0 \leq \text{rank }A \leq \min(m,n ) \text{ and } \text{rank }A = 0 \iff A = 0 \\ \text{null } A = (\text{range }A^T) ^\perp \text{ so } \text{null } A \oplus \text{range }A^T = F^n \\ \dim \text{null } A + \text{rank }A = n $$
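
As a small worked example (the matrix is chosen only for illustration), take a $2 \times 3$ matrix whose second row is twice the first. Its rank is $1$, its null space is $2$-dimensional and orthogonal to $\text{range }A^T$, and the dimensions add up to $n = 3$.

$$ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{bmatrix}, \quad \text{rank }A = 1, \quad \text{range }A^T = \text{span}\left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \right\}, \quad \text{null }A = \text{span}\left\{ \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix} \right\} \\ \dim \text{null } A + \text{rank }A = 2 + 1 = 3 $$
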
If a wide matrix $A$ has linearly independent rows, then $Ax = y$ has a solution for every $y$, and its solution set $\mathcal{S}$ is an affine set: a particular solution $x_0$ translated by the null space of $A$.

$$ \mathcal{S} = x_0 + \text{null }A $$
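
A minimal sketch of this structure, assuming a hypothetical $2 \times 3$ system: every point $x_0 + tz$ with $z \in \text{null }A$ solves $Ax = y$.

```python
def matvec(M, x):
    # Matrix-vector product with plain lists.
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


# Hypothetical wide matrix with linearly independent rows.
A = [[1, 1, 0],
     [0, 1, 1]]
y = [2, 3]

x0 = [2, 0, 3]   # one particular solution: A x0 = y
z = [1, -1, 1]   # spans null A for this A: A z = 0


def solution(t):
    # A point of S = x0 + null A
    return [x0_i + t * z_i for x0_i, z_i in zip(x0, z)]


all_solve = all(matvec(A, solution(t)) == y for t in (-2, 0, 1, 7))
```
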

Moore-Penrose Pseudoinverse (Full Row Rank). Obtains the solution of least norm to $Ax = y$.

$$ x^* = A^T (AA^T)^{-1}y $$
- Proof
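
The full row rank formula can be checked on a small system. This sketch uses exact fractions and a hand-written $2 \times 2$ inverse; the matrix, the right-hand side, and the helper names are all assumptions for illustration.

```python
from fractions import Fraction


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical wide matrix with full row rank, and a right-hand side.
A = [[1, 1, 0],
     [0, 1, 1]]
y = [[2], [3]]

At = transpose(A)
# x* = A^T (A A^T)^{-1} y
x_star = matmul(At, matmul(inverse_2x2(matmul(A, At)), y))


def norm_sq(col):
    return sum(entry[0] ** 2 for entry in col)


# Another solution of A x = y, for comparison of norms.
other = [[2], [0], [3]]
```

Here `other` also satisfies $Ax = y$ but has a larger norm than `x_star`, which is what the least norm property predicts.
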
Moore-Penrose Pseudoinverse (Full Column Rank). Obtains the least squares solution to $Ax = y$.

$$ x^* = (A^T A)^{-1}A^T y $$

$$ \mathcal{S} = \{x^* \in \R^n \space | \space A^T Ax^* = A^T y\} $$
- Proof (linear algebra)
- Proof (calculus)
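
The full column rank formula and the normal equations can be checked the same way. The sketch below fits a line to three hypothetical data points using exact fractions; the data and helper names are assumptions for illustration.

```python
from fractions import Fraction


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical tall matrix with full column rank: rows are [1, t] for t = 0, 1, 2.
A = [[1, 0],
     [1, 1],
     [1, 2]]
y = [[1], [2], [2]]

At = transpose(A)
# x* = (A^T A)^{-1} A^T y
x_star = matmul(inverse_2x2(matmul(At, A)), matmul(At, y))

# The residual y - A x* should be orthogonal to the columns of A,
# which is exactly the normal equations A^T A x* = A^T y.
fitted = matmul(A, x_star)
residual = [[y_i[0] - p_i[0]] for y_i, p_i in zip(y, fitted)]
normal_eq_gap = matmul(At, residual)
```

The system $Ax = y$ has no exact solution here, so the residual is nonzero, but `normal_eq_gap` is exactly zero.
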
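If $A$ has full column rank, then $A^TA$ is invertible, since $A^TAx = 0$ implies $\|Ax\|^2 = x^T A^T A x = 0$ and hence $x = 0$. If $A$ has full row rank, then $AA^T$ is invertible by the same argument applied to $A^T$.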

Hyperplane

A hyperplane $H$ is an $(n-1)$-dimensional affine set, typically embedded in an $n$-dimensional vector space $\mathbf{V}$, defined with respect to a normal vector $0 \neq a \in \mathbf{V}$ and $b \in \mathbf{V}$.