A matrix $0 \neq B \in \R^{m \times n}$ has rank $1$ iff there exist nonzero vectors $x \in \R^m$, $y \in \R^n$ s.t. $B = xy^T$.
Any matrix $A \in \R^{m \times n}$ can be decomposed into an orthogonal matrix $U \in \R^{m \times m}$ whose columns, the left singular vectors, are the eigenvectors of $AA^T$; a diagonal matrix $\Sigma \in \R^{m \times n}$ containing the singular values of $A$; and an orthogonal matrix $V \in \R^{n \times n}$ whose columns, the right singular vectors, are the eigenvectors of $A^TA$.
$$ A = U\Sigma V^T $$
The diagonal matrix $\Sigma$ can be represented in block matrix form in the following way:
$$ \Sigma = \begin{bmatrix} \Sigma_r & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{bmatrix} \\ \Sigma_r = \begin{bmatrix} \sigma_1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sigma_r \end{bmatrix}, \hspace{15pt} r = \operatorname{rank} A $$
By convention, the singular values, which are the square roots of the non-zero eigenvalues of $A^TA$ and $AA^T$, are listed out in decreasing order.
$$ \sigma_1 \geq \dots \geq \sigma_r >0 $$
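The decomposition above can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.svd` on a random matrix; the shapes and variable names are illustrative choices, not part of the text.

```python
import numpy as np

# Sketch: verify A = U Sigma V^T for a random 4x3 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# full_matrices=True returns square U (4x4) and Vt (3x3);
# s holds the singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the singular values into the rectangular diagonal matrix Sigma.
Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)

assert np.allclose(A, U @ Sigma @ Vt)      # A = U Sigma V^T
assert np.all(s[:-1] >= s[1:])             # sigma_1 >= sigma_2 >= ...
assert np.allclose(U.T @ U, np.eye(4))     # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(3))   # V is orthogonal
```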
<aside> <img src="/icons/castle_yellow.svg" alt="/icons/castle_yellow.svg" width="40px" />
PSD of $A^TA$ and $AA^T$. Both of these matrices are positive semidefinite, which can be shown by expanding out the SVD of $A$.
</aside>
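One way to carry out the expansion mentioned in the aside: substituting the SVD,

$$ A^TA = (U\Sigma V^T)^T(U\Sigma V^T) = V\Sigma^T U^T U \Sigma V^T = V(\Sigma^T\Sigma)V^T, $$

so for any $x \in \R^n$, $x^T A^T A x = \|\Sigma V^T x\|^2 \geq 0$, i.e. $A^TA \succeq 0$; the diagonal entries of $\Sigma^T\Sigma$ are $\sigma_i^2 \geq 0$. The same argument with $AA^T = U(\Sigma\Sigma^T)U^T$ shows $AA^T \succeq 0$.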
<aside> <img src="/icons/castle_yellow.svg" alt="/icons/castle_yellow.svg" width="40px" />
Rank One Approximation. The outer product perspective of matrix multiplication allows us to approximate a matrix in terms of its singular vectors associated with the largest singular value.
$$ A= \sum _{i=1} ^r \sigma _i u^{(i)}(v^{(i)})^T \approx \sigma _1 u^{(1)} (v^{(1)})^T $$
</aside>
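The rank-one approximation in the aside can be sketched in NumPy. The check against $\sigma_2$ relies on the Eckart–Young theorem (the spectral-norm error of the best rank-one approximation is the second singular value), which is an assumption beyond the text.

```python
import numpy as np

# Sketch: approximate A by sigma_1 u^{(1)} (v^{(1)})^T.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])   # rank-one approximation

# Summing all r outer-product terms reconstructs A exactly.
A_full = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
assert np.allclose(A_full, A)

# Eckart-Young: the spectral-norm error of A1 equals sigma_2.
assert np.isclose(np.linalg.norm(A - A1, 2), s[1])
```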
The Moore–Penrose pseudoinverse of a matrix $A \in \R^{m \times n}$ produces the least-norm solution to the system of equations $Ax = y$ (and, when $y \notin \text{range } A$, the least-norm least-squares solution).
$$ A^\dagger = V \Sigma^\dagger U^T \text{ with } \Sigma^\dagger = \begin{bmatrix} \Sigma_r^{-1} & 0_{r \times (m-r)} \\ 0_{(n-r) \times r} & 0_{(n-r) \times (m-r)} \end{bmatrix} \\
A^{\dagger} y = \argmin_{x \in \R^n} \|x\| \hspace{10pt} \text{s.t. } x \in \mathcal{S} $$
To see why this is the case, recall that the solution set of $Ax = y$, assuming $y \in \text{range } A$, has the form $\mathcal{S} = \{ x^* \mid A^TAx^* = A^Ty \}$, which can be written as $\mathcal{S} = x_0 + \text{null } A$ for any particular solution $x_0$. Moreover, when $A$ has full row rank, $A^T(AA^T)^{-1}y$ is the least-norm solution, and it is exactly $A^{\dagger}y$. Hence, we can write $\mathcal{S} = A^{\dagger}y + \text{null } A$; since $A^{\dagger}y$ is orthogonal to $\text{null } A$, adding any nonzero null-space component only increases the norm.
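The least-norm property can be sketched numerically with `np.linalg.pinv`. The wide, full-row-rank matrix below is an illustrative choice; a random Gaussian matrix has full row rank almost surely.

```python
import numpy as np

# Sketch: for a wide full-row-rank A, pinv(A) @ y is the least-norm solution.
rng = np.random.default_rng(2)
A = rng.standard_normal((2, 4))   # 2x4, full row rank almost surely
y = rng.standard_normal(2)

x_dagger = np.linalg.pinv(A) @ y
assert np.allclose(A @ x_dagger, y)   # it solves Ax = y

# For full row rank, A^T (A A^T)^{-1} y agrees with the pseudoinverse solution.
x_formula = A.T @ np.linalg.solve(A @ A.T, y)
assert np.allclose(x_dagger, x_formula)

# Perturbing by a null-space direction (a trailing right singular vector)
# stays in the solution set S but strictly increases the norm.
_, _, Vt = np.linalg.svd(A)
n = Vt[2]                             # a unit vector in null A
assert np.allclose(A @ n, 0)
assert np.linalg.norm(x_dagger + n) > np.linalg.norm(x_dagger)
```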