Geometric view of matrices, Diagonalization and SVD


I’ve found very few notes on the internet that talk about the geometric intuition behind matrices and linear algebra, and I wanted to write about it if I ever found good material. This blogpost talks about linear transformations, the geometric intuition of matrices, change of basis, eigen decomposition and singular value decomposition.

Linear mapping

Vectors are objects that can be added together and multiplied by a scalar, and the resulting object is still a vector. Consider two vector spaces $V$ and $W$. A mapping $\phi: V \rightarrow W$ preserves the structure of the vector spaces if for all $x, y \in V$ and $\alpha \in \mathbb{R}$,

\[ \begin{aligned} \phi(x+y) &= \phi(x) + \phi(y) \newline \phi(\alpha x) &= \alpha \phi(x) \end{aligned} \]

This mapping is also called a linear mapping. It can be summarized as $\phi(\alpha x + \beta y) = \alpha \phi(x) + \beta \phi(y)$ for all $x, y \in V$ and $\alpha, \beta \in \mathbb{R}$.

We can represent linear mappings/transformations as matrices. Consider a mapping $\phi: V \rightarrow W$, where $V$ and $W$ are arbitrary sets. Then the mapping $\phi$ is called

  • Injective if $\phi(x) = \phi(y) \implies x = y$ for all $x, y \in V$
  • Surjective if $\phi(V) = W$
  • Bijective if both injective and surjective

The following are special cases of linear mappings between vector spaces $V$ and $W$:

  • Isomorphism if $\phi: V \rightarrow W$ is linear and bijective.
  • Endomorphism if $\phi: V \rightarrow V$ is linear.
  • Automorphism if $\phi: V \rightarrow V$ is linear and bijective.

Consider vector spaces $V, W$ with corresponding ordered bases $B = (b_1, \dots, b_n)$ and $C = (c_1, \dots, c_m)$. Consider that $\phi: V \rightarrow W$ is a linear transformation such that for $j \in \{1, \dots, n\}$,

\[ \phi(b_j) = \sum_{i=1}^{m} \alpha_{ij} c_i \]

is the unique representation of $\phi(b_j)$ w.r.to $C$. We call the $m \times n$ matrix $A_{\phi}$, with entries $A_{\phi}(i, j) = \alpha_{ij}$,

the transformation matrix of $\phi$ (w.r.to the ordered bases $B$ of $V$ and $C$ of $W$). If $\hat{x}$ is the coordinate representation of a vector $x \in V$ w.r.to $B$ and $\hat{y}$ is the coordinate representation of $y = \phi(x) \in W$ w.r.to $C$, then

\[ \hat{y} = A_{\phi} \hat{x} \]

i.e. the transformation matrix can be used to map coordinate vectors w.r.to an ordered basis of $V$ onto coordinate vectors w.r.to an ordered basis of $W$.
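To make this concrete, here is a minimal NumPy sketch (the mapping and the basis are made up for illustration): the $j$-th column of the transformation matrix holds the $C$-coordinates of $\phi(b_j)$, and multiplying by $A_{\phi}$ maps $B$-coordinates to $C$-coordinates.

```python
import numpy as np

# Toy example (made up for illustration): phi: R^2 -> R^2 is given in standard
# coordinates by the matrix M. B is the standard basis of the domain, and the
# codomain uses the ordered basis C = (c_1, c_2) with c_1 = (1, 1), c_2 = (1, -1).
M = np.array([[2.0, 0.0],
              [1.0, 1.0]])
C = np.array([[1.0, 1.0],
              [1.0, -1.0]])            # columns are the basis vectors c_1, c_2

# Column j of A_phi holds the coordinates of phi(b_j) w.r.to C.
# Since B is the standard basis, this is just C^{-1} M.
A_phi = np.linalg.inv(C) @ M

x_hat = np.array([3.0, -1.0])          # coordinates of x w.r.to B
y_hat = A_phi @ x_hat                  # coordinates of phi(x) w.r.to C

# Assembling y from its C-coordinates recovers phi(x) in standard coordinates.
assert np.allclose(C @ y_hat, M @ x_hat)
```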

Let's see how the transformation matrix changes if we change the ordered basis of $V$ from $B$ to $B_1$ and the ordered basis of $W$ from $C$ to $C_1$. Let $\hat{A_{\phi}}$ be the transformation matrix w.r.to the new bases, i.e.

\begin{equation} \hat{y} = \hat{A_{\phi}} \hat{x} \label{eq5} \end{equation}

where $\hat{x}$ and $\hat{y}$ are the representations of $x$ and $y$ w.r.to the new bases $B_1$ of $V$ and $C_1$ of $W$. If $\hat{x}_B$ and $\hat{y}_C$ denote the corresponding coordinates w.r.to the old bases $B$ and $C$, then there exist invertible matrices $S$ and $T$ such that

\begin{equation} \hat{x}_B = S \hat{x} \label{eq3} \end{equation}

\begin{equation} \hat{y}_C = T \hat{y} \label{eq4} \end{equation}

Therefore,

substituting (\ref{eq3}) and (\ref{eq4}) in $\hat{y}_C = A_{\phi} \hat{x}_B$,

\[ \begin{aligned} T\hat{y} &= A_{\phi} S \hat{x} \newline \hat{y} &= T^{-1}A_{\phi}S \hat{x} \end{aligned} \]

comparing the above equation to (\ref{eq5}) gives

\[ \hat{A_{\phi}} = T^{-1} A_{\phi} S \]

Here, $S$ is the transformation matrix (of $id_V$) that maps coordinates with respect to $B_1$ onto coordinates with respect to $B$, and $T$ is the transformation matrix (of $id_W$) that maps coordinates with respect to $C_1$ onto coordinates with respect to $C$.

The transformation matrix $\hat{A_{\phi}}$ can be interpreted as follows: First, $S$ transforms coordinates w.r.to $B_1$ onto coordinates w.r.to $B$. Then, the transformation matrix $A_{\phi}$ is used to map these coordinates onto the vector space $W$ w.r.to $C$. Finally, $T^{-1}$ maps the coordinates w.r.to $C$ onto coordinates w.r.to $C_1$. If we write down all the linear transformations, then

\[ \begin{aligned} B_1 \rightarrow C_1 &= B_1 \rightarrow B \rightarrow C \rightarrow C_1 \newline \hat{A_{\phi}} &= T^{-1} A_{\phi} S \end{aligned} \]

The matrices/linear mappings $A_{\phi}$ and $\hat{A_{\phi}}$ are called equivalent matrices, since they correspond to the same linear mapping, but w.r.to different bases.
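A numerical sketch of the basis-change formula (with an arbitrary mapping and arbitrary new bases): the columns of $S$ hold the new domain basis $B_1$ expressed in $B$, the columns of $T$ hold the new codomain basis $C_1$ expressed in $C$, and $\hat{A_{\phi}} = T^{-1}A_{\phi}S$ then maps $B_1$-coordinates to $C_1$-coordinates.

```python
import numpy as np

A_phi = np.array([[1.0, 2.0],
                  [3.0, 4.0]])         # transformation matrix w.r.to old bases B, C
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])             # columns: new domain basis B1, expressed in B
T = np.array([[2.0, 0.0],
              [1.0, 1.0]])             # columns: new codomain basis C1, expressed in C

A_hat = np.linalg.inv(T) @ A_phi @ S   # transformation matrix w.r.to B1 and C1

x_new = np.array([1.0, 2.0])           # coordinates of some x w.r.to B1
y_old = A_phi @ (S @ x_new)            # image of x, in C-coordinates
y_new = A_hat @ x_new                  # image of x, in C1-coordinates

# Converting the C1-coordinates back to C-coordinates recovers the same image.
assert np.allclose(T @ y_new, y_old)
```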

Similar matrices

Two matrices $A, \hat{A} \in \mathbb{R}^{n \times n}$ are similar if there exists an invertible matrix $S \in \mathbb{R}^{n \times n}$ such that $\hat{A} = S^{-1} A S$. Similar matrices correspond to the same endomorphism w.r.to different bases. Similar matrices are always equivalent; however, equivalent matrices are not necessarily similar. Similar matrices have the same eigen values.

Remark: Eigen value analysis can be used to characterize a matrix and its associated linear mapping. Similar matrices have the same eigen values. Therefore, a linear mapping has eigen values that are independent of the choice of basis of its transformation matrix. This makes eigen values, together with the determinant and the trace, key characteristic parameters of a linear mapping, as they are all invariant under a basis change.
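A quick numerical check of the remark (a sketch with an arbitrary matrix $A$ and an arbitrary invertible $S$): the similar matrix $S^{-1}AS$ has the same eigen values, trace and determinant as $A$.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])                     # arbitrary matrix, eigenvalues 2 and 3
S = np.array([[1.0, 2.0],
              [1.0, 3.0]])                     # arbitrary invertible basis-change matrix
A_sim = np.linalg.inv(S) @ A @ S               # a matrix similar to A

# Eigenvalues, trace and determinant are invariant under the basis change.
assert np.allclose(np.sort(np.linalg.eigvals(A)), np.sort(np.linalg.eigvals(A_sim)))
assert np.isclose(np.trace(A), np.trace(A_sim))
assert np.isclose(np.linalg.det(A), np.linalg.det(A_sim))
```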

Determinant

The determinant can be thought of as the signed volume of the n-dimensional parallelepiped formed by the columns of a matrix. Similar matrices have the same determinant, so the determinant is invariant under a change of basis of the linear mapping, and the same holds for the trace of a matrix.
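For instance (a tiny sketch), the parallelogram spanned by the columns $(2, 0)$ and $(1, 3)$ has area $6$, and swapping the columns flips the orientation and hence the sign of the determinant.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])              # columns (2, 0) and (1, 3)
print(np.linalg.det(A))                 # ~6.0: signed area of the parallelogram
print(np.linalg.det(A[:, ::-1]))        # ~-6.0: swapped columns reverse the orientation
```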

Let’s gain some intuition about eigen values, eigen vectors and the determinant using a few linear mappings.


Overview of five linear mappings, mapping 400 color-coded points in $\mathbb{R}^2$ (left column) onto target points in $\mathbb{R}^2$ (right column) using mappings $A \in \mathbb{R}^{2 \times 2}$. The central column depicts the eigen vectors, stretched by their corresponding eigen values. All transformations here are w.r.to the standard basis.

  1. Two eigen vectors corresponding to the standard basis in $\mathbb{R}^2$. The vertical one is stretched to twice its size ($\lambda_1 = 2$) and the horizontal one is shrunk to half its size ($\lambda_2 = \frac{1}{2}$). So $det(A) = 1$ and the mapping is area preserving.
  2. Both eigen values are the same ($\lambda_1 = \lambda_2 = 1$) and the eigen vectors are collinear (along the horizontal axis). So the mapping shears the space only horizontally. It is also area preserving.
  3. The eigen values are complex, indicating that the transformation is a rotation (there is no real direction along which vectors simply get stretched or shrunk). It is also area preserving.
  4. One of the eigen values is zero, so the space collapses along the direction of the eigen vector corresponding to eigen value 0, and the mapping maps 2D points onto a single line (spanned by the other eigen vector). Since $det(A) = 0$, the area of the mapping is zero.
  5. Two eigen values $\lambda_1 = 0.5$ and $\lambda_2 = 1.5$. The mapping scales the space along one eigen vector by a factor of 0.5 and along the other by a factor of 1.5. Since $det(A) = 0.75$, the mapping scales the area by 75% (a numerical check of these cases is sketched below).
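The exact matrices behind the figure aren't reproduced here, but the claims can be checked with representative matrices chosen to match each description (an assumption made for illustration):

```python
import numpy as np

# Representative matrices matching the five cases described above (assumed for
# illustration; the matrices used to generate the figure may differ).
cases = {
    "1. scale by (1/2, 2)":   np.array([[0.5, 0.0], [0.0, 2.0]]),
    "2. horizontal shear":    np.array([[1.0, 0.5], [0.0, 1.0]]),
    "3. rotation by 45 deg":  np.array([[np.cos(np.pi/4), -np.sin(np.pi/4)],
                                        [np.sin(np.pi/4),  np.cos(np.pi/4)]]),
    "4. rank-1 collapse":     np.array([[1.0, -1.0], [-1.0, 1.0]]),
    "5. scale by (0.5, 1.5)": np.array([[1.0, 0.5], [0.5, 1.0]]),
}

for name, A in cases.items():
    eigvals = np.linalg.eigvals(A)
    print(f"{name}: eigenvalues = {np.round(eigvals, 3)}, det = {np.linalg.det(A):.2f}")
```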

Diagonalization

A matrix $A \in \mathbb{R}^{n \times n}$ is diagonalizable if it is similar to a diagonal matrix $D$, i.e. if there exists an invertible matrix $P \in \mathbb{R}^{n \times n}$ such that

\[ A = PDP^{-1} \]

  • It can be shown that $P$ is the matrix formed by the eigen vectors of $A$ as columns, and the diagonal entries of $D$ are the corresponding eigen values.
  • Since the matrix $P$ has to be invertible, only matrices with $n$ linearly independent eigen vectors can be diagonalized.
  • If a square matrix has fewer than $n$ linearly independent eigen vectors, the matrix is said to be defective. Defective matrices cannot be diagonalized.
  • A square symmetric matrix is always diagonalizable, because it has $n$ eigen vectors that form a basis of $\mathbb{R}^n$. Furthermore, its eigen vectors are orthogonal and its eigen values are real.

Diagonalization/Eigen decomposition can be geometrically interpreted as follows.

  • $P^{-1}$ performs the basis change from the standard basis to the eigen basis. This maps the eigen vectors $p_i$ onto the standard basis vectors.
  • $D$ scales these vectors along the axes by the eigen values $\lambda_i$.
  • Finally, $P$ transforms these scaled vectors back to standard coordinates, yielding $Ax = PDP^{-1}x$.


Geometric intuition of eigen decomposition as sequential linear transformations. Top-left to bottom-left: Basis change, mapping the eigenvectors onto the standard basis. Bottom-left to bottom-right: Scaling along the remapped orthogonal eigenvectors, depicted here by a circle being stretched to an ellipse. Bottom-right to top-right: Undoing the basis change (depicted as a reverse rotation), restoring the original coordinate frame.

The whole idea of diagonalizing a matrix is to perform a basis change onto the eigen basis, where the mapping is easy to work with, and then revert back to the standard basis after performing the mapping.

Let $A$ be a diagonalizable matrix with eigen values $\lambda_1, \dots, \lambda_n$ and corresponding eigen vectors $p_1, \dots, p_n$. Now for any vector $x \in \mathbb{R}^n$,

  • Let $\hat{x} = P^{-1}x$ be the representation of $x$ w.r.to the eigen basis, i.e. $x = \sum_i \hat{x}_i p_i$.
  • Then, $D\hat{x}$ scales each coordinate by the corresponding eigen value: $(D\hat{x})_i = \lambda_i \hat{x}_i$.
  • Finally, $Ax = PD\hat{x} = \sum_i \lambda_i \hat{x}_i p_i$ maps the scaled coordinates back onto the standard basis.

Diagonalizing makes it easier to raise a matrix to an integer power, i.e.

\[ A^k = (PDP^{-1})^k = PD^kP^{-1} \]

where $D^k$ is computed by simply raising each diagonal entry to the $k$-th power.
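As a quick numerical check (a minimal NumPy sketch; the matrix is an arbitrary symmetric, hence diagonalizable, example), we can verify both the reconstruction $A = PDP^{-1}$ and the power identity $A^3 = PD^3P^{-1}$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])             # arbitrary symmetric (hence diagonalizable) matrix

eigvals, P = np.linalg.eig(A)          # columns of P are the eigenvectors of A
D = np.diag(eigvals)

# A = P D P^{-1}
assert np.allclose(A, P @ D @ np.linalg.inv(P))

# A^3 = P D^3 P^{-1}, where D^3 just cubes the diagonal entries
assert np.allclose(np.linalg.matrix_power(A, 3),
                   P @ np.diag(eigvals**3) @ np.linalg.inv(P))
```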

Diagonalization can only be performed on square matrices. Let’s look at the general matrix decomposition technique called singular value decomposition.

Singular Value Decomposition

SVD can be applied to any matrix and it always exists. The SVD of a matrix $A$, which represents a linear mapping $\phi: V \rightarrow W$, quantifies the change between the underlying geometry of these two vector spaces. Let $A \in \mathbb{R}^{m \times n}$ be a rectangular matrix with rank $r$. Then $A$ can be decomposed as

\[ A = USV^T \]

$U \in \mathbb{R}^{m \times m}$ is an orthogonal matrix with column vectors $u_i$, $V \in \mathbb{R}^{n \times n}$ is another orthogonal matrix with column vectors $v_j$, and $S \in \mathbb{R}^{m \times n}$ is a matrix whose only non-zero entries are the diagonal entries $S_{ii} = \sigma_i \geq 0$. The diagonal entries $\sigma_i$ of $S$ are called the singular values, the $u_i$ are called the left-singular vectors, and the $v_j$ are called the right-singular vectors. By convention, the singular values are ordered, i.e. $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_r \geq 0$.
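A minimal NumPy sketch (the matrix below is an arbitrary example) showing the shapes and the ordering convention; `np.linalg.svd` returns the singular values as a vector, so the rectangular matrix $S$ is rebuilt before reconstructing $A$:

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [2.0, 3.0],
              [2.0, -2.0]])            # arbitrary 3x2 matrix (m=3, n=2), rank 2

U, sigma, Vt = np.linalg.svd(A)        # U: 3x3, sigma: (2,), Vt: 2x2
print(sigma)                           # singular values in decreasing order, here ~[5. 3.]

# Rebuild the 3x2 singular value matrix S and check A = U S V^T.
S = np.zeros(A.shape)
np.fill_diagonal(S, sigma)
assert np.allclose(A, U @ S @ Vt)
```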

Geometric view of SVD

The SVD of a matrix can be interpreted as the decomposition of a linear mapping $\phi: \mathbb{R}^n \rightarrow \mathbb{R}^m$ into a sequence of linear operations. Assume the transformation is expressed w.r.to the standard bases $B$ of $\mathbb{R}^n$ and $C$ of $\mathbb{R}^m$. Then

  • $V^T$ performs the basis change from the standard basis $B$ onto the right singular basis $(v_1, \dots, v_n)$. This maps the vectors $v_i$ onto the standard basis vectors.
  • $S$ is the transformation matrix of $\phi$ w.r.to the right singular basis (dimension $n$) and the left singular basis (dimension $m$). $S$ scales the new coordinates by the corresponding singular values and adds/deletes dimensions based on the values of $m$ and $n$. If $m > n$, it introduces new dimensions by embedding the data into a higher-dimensional space, and if $m < n$, it drops certain dimensions.
  • $U$ maps the coordinates from the left singular basis $(u_1, \dots, u_m)$ back onto the standard basis $C$ of $\mathbb{R}^m$.


Geometric intuition of SVD of a matrix $A\in \mathbb{R}^{3 \times 2}$ as sequential linear transformations. Top-left to bottom-left: $V^T$ performs a basis change in $\mathbb{R}^2$. Bottom-left to bottom-right: $S$ scales and maps $\mathbb{R}^2$ to $\mathbb{R}^3$, the ellipse in the bottom right lies in $\mathbb{R}^3$. Bottom-right to top-right: $U$ performs basis change in $\mathbb{R}^3$

SVD performs a basis change in both the domain and the co-domain, and the transformation w.r.to these new bases is represented by the singular value matrix $S$.

Transformations one and three involve multiplication by an orthogonal matrix, hence steps one and three are pure rotations (possibly combined with a reflection). Orthogonal transformations preserve the lengths of vectors as well as the angles between vectors.
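A small check of this property (a sketch; the orthogonal matrix is built from the QR decomposition of a random matrix): multiplying by an orthogonal matrix leaves norms and inner products, and hence angles, unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an orthogonal matrix Q from the QR decomposition of a random matrix.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
assert np.allclose(Q.T @ Q, np.eye(3))

x, y = rng.normal(size=3), rng.normal(size=3)

# Lengths are preserved: ||Qx|| = ||x||
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))

# Inner products (and hence angles) are preserved: (Qx).(Qy) = x.y
assert np.isclose((Q @ x) @ (Q @ y), x @ y)
```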

Construction of SVD

Let $A \in \mathbb{R}^{m \times n}$, then $A^TA \in \mathbb{R}^{n \times n}$ and $AA^T \in \mathbb{R}^{m \times m}$. Both $A^TA$ and $AA^T$ are symmetric positive semidefinite matrices, so they can be eigen decomposed. Since both matrices have rank $r$, they have the same $r$ non-zero eigen values. $A^TA$ has $n - r$ zero eigen values and $AA^T$ has $m - r$ zero eigen values. Writing $A = USV^T$ and using $U^TU = \mathbf{I}$,

\[ A^TA = (USV^T)^T(USV^T) = VS^TU^TUSV^T = VS^TSV^T \]

So the diagonal elements $\sigma_i^2$ of $S^TS$ are the eigen values of $A^TA$, and the right singular vectors $v_i$ are the eigen vectors of $A^TA$. Let $v_1, \dots, v_n$ be the eigen vectors corresponding to the eigen values $\lambda_1 \geq \dots \geq \lambda_n$, where $\lambda_i > 0$ for $i \leq r$ and $\lambda_i = 0$ for $i > r$. Let $V_1 = [v_1, \dots, v_r]$ and $V_2 = [v_{r+1}, \dots, v_n]$. Then

\[ V_2^T A^T A V_2 = (AV_2)^T(AV_2) = \mathbf{0} \]

i.e. the second set of eigen vectors corresponds to the null space of $A$: since $(AV_2)^T(AV_2) = \mathbf{0}$, we get $AV_2 = \mathbf{0}$.

Restricting the eigen decomposition to the non-zero eigen values gives

\[ V_1^T A^T A V_1 = \text{diag}(\lambda_1, \dots, \lambda_r) \]

So, we have \begin{equation} Z^{-1}V^T_1 A^TAV_1Z^{-1} = \mathbf{I} \end{equation} where

\[ Z = \text{diag}(\sigma_1, \dots, \sigma_r) = \text{diag}(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_r}) \]

Let $U_1 = AV_1Z^{-1}$, an $m \times r$ matrix. Then we have $U_1^TU_1 = Z^{-1}V_1^TA^TAV_1Z^{-1} = \mathbf{I}$, showing that the columns $u_i = Av_i / \sigma_i$ are orthonormal vectors in $\mathbb{R}^m$, where $\sigma_i = \sqrt{\lambda_i}$.

Similarly, we can show that the diagonal elements $\sigma_i^2$ are the eigen values of $AA^T$ and the left singular vectors $u_i$ are the eigen vectors of $AA^T$. We can factor $U = [U_1, U_2]$, where $U_1$ is given by the above equation and $U_2$ spans the null space of $A^T$, which has dimension $(m - r)$.
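This construction can be followed numerically (a sketch, reusing the arbitrary rank-2 matrix from the earlier SVD snippet, so $r = n$ and $V_2$ is empty): eigendecompose $A^TA$, set $\sigma_i = \sqrt{\lambda_i}$, and form $u_i = Av_i/\sigma_i$.

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [2.0, 3.0],
              [2.0, -2.0]])            # arbitrary 3x2 matrix with rank r = n = 2

# Eigendecompose the symmetric matrix A^T A (eigh returns ascending eigenvalues).
lam, V1 = np.linalg.eigh(A.T @ A)
lam, V1 = lam[::-1], V1[:, ::-1]       # reorder to descending eigenvalues

sigma = np.sqrt(lam)                   # singular values sigma_i = sqrt(lambda_i)
U1 = A @ V1 / sigma                    # columns u_i = A v_i / sigma_i

# The columns of U1 are orthonormal, and A = U1 Z V1^T (reduced SVD, since r = n).
assert np.allclose(U1.T @ U1, np.eye(2))
assert np.allclose(A, U1 @ np.diag(sigma) @ V1.T)
```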

Eigen decomposition vs SVD

Let us consider the eigendecomposition $A = PDP^{-1}$ and the SVD $A = USV^T$.

  • The SVD always exists for any matrix $A \in \mathbb{R}^{m \times n}$. The eigendecomposition is only defined for square matrices $A \in \mathbb{R}^{n \times n}$ and only exists if we can find a basis of $\mathbb{R}^n$ consisting of eigen vectors of $A$.
  • The vectors in the eigendecomposition matrix $P$ are not necessarily orthogonal, i.e., the change of basis is not a simple rotation and scaling. On the other hand, the vectors in the matrices $U$ and $V$ in the SVD are orthonormal, so they do represent rotations.
  • Both the eigendecomposition and the SVD are compositions of three linear mappings:
    • Change of basis in the domain
    • Independent scaling of each new basis vector and mapping from domain to codomain
    • Change of basis in codomain

    A key difference between the eigendecomposition and the SVD is that in the SVD, domain and codomain can be vector spaces of different dimensions.

  • In the SVD, the left and right-singular vector matrices $U$ and $V$ are generally not inverses of each other (they perform basis changes in different vector spaces). In the eigendecomposition, the basis change matrices $P$ and $P^{-1}$ are inverses of each other.
  • In the SVD, the entries in the diagonal matrix $S$ are all real and non-negative, which is not generally true for the diagonal matrix $D$ in the eigendecomposition.
  • For symmetric positive semidefinite matrices, the eigendecomposition and the SVD coincide (a quick check follows below).
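A quick check of the last point (a sketch with an arbitrary symmetric positive definite matrix): its eigen values equal its singular values, and both factorizations reconstruct the matrix.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])             # arbitrary symmetric positive definite matrix

eigvals, P = np.linalg.eigh(A)         # eigenvalues in ascending order
U, sigma, Vt = np.linalg.svd(A)        # singular values in descending order

# Eigenvalues and singular values coincide (up to ordering).
assert np.allclose(np.sort(eigvals)[::-1], sigma)

# Both factorizations reconstruct A.
assert np.allclose(A, P @ np.diag(eigvals) @ P.T)
assert np.allclose(A, U @ np.diag(sigma) @ Vt)
```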

SVD has many applications in machine learning ranging from dimensionality reduction to data compression and clustering.

References

Most of the content and all the images are borrowed from

  1. Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong, Mathematics for Machine Learning