Intuitive Understanding of Singular Value Decomposition (SVD)

Singular Value Decomposition (SVD) is a method to factorize, or decompose, a matrix. Let’s say we view $A$, a matrix of dimension $n \times m$, as a transform or mapping from $x \in \mathbb{R}^m$ to $y \in \mathbb{R}^n$, i.e.:

$$y = Ax$$
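As a tiny illustration of the mapping and its dimensions, here is a hedged MATLAB sketch (the matrix and vector are made up for this example, not taken from anything later in the post):

% an n-by-m matrix maps a vector in R^m to a vector in R^n
M = [1 2 0;
     0 1 1];     % n = 2 rows, m = 3 columns
x = [1; 2; 3];   % x lives in R^3
y = M*x          % y lives in R^2, here y = [5; 5]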

SVD states:

$$A = U \Sigma V^T$$

$U$ and $V^T$, being orthogonal (unitary) matrices, can be viewed as two rotation operators, and $\Sigma$ can be viewed as a set of scaling factors along the directions given by the rows of $V^T$. But why two rotations and one scaling? In practice, $x$ and $y$ are usually associated with some meaningful properties, for example features and items. We are really interested in the relationship between $x$ and $y$, defined by $A$, and SVD is used to simplify and clarify that relationship. For example, the rows of $V^T$ can be viewed as projection directions in the $x$ space, sorted by the singular values given in $\Sigma$. More specifically, let's examine the SVD one singular value and singular-vector trio at a time:

$$
U \Sigma V^T
= \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix}
  \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_m \\ 0 & \cdots & 0 \end{bmatrix}
  \begin{bmatrix} v_1^T \\ \vdots \\ v_m^T \end{bmatrix}
= \begin{bmatrix} \sigma_1 u_1 & \cdots & \sigma_m u_m \end{bmatrix}
  \begin{bmatrix} v_1^T \\ \vdots \\ v_m^T \end{bmatrix}
= \sum_{i=1}^{m} \sigma_i u_i v_i^T
$$
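As a quick sanity check of the identity above, here is a short MATLAB sketch (the matrix B and the loop are illustrative additions, not part of the original example) that confirms $U$ and $V$ are orthogonal and that summing the modes $\sigma_i u_i v_i^T$ rebuilds the matrix:

B = [3 1; 1 3; 2 2; 0 1];       % any n-by-m matrix will do
[U,S,V] = svd(B);
norm(U'*U - eye(size(U,1)))     % ~0: U is orthogonal
norm(V'*V - eye(size(V,1)))     % ~0: V is orthogonal
Bsum = zeros(size(B));
for i = 1:min(size(B))
    % accumulate the modes sigma_i * u_i * v_i'
    Bsum = Bsum + S(i,i) * U(:,i) * V(:,i)';
end
norm(B - Bsum)                  % ~0: the sum of modes recovers B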

Each $\sigma_i u_i v_i^T$ here can be viewed as a mode, or pattern, in $A$, with $\sigma_i$ describing how important it is; $u_i$ describes its distribution along $\mathbb{R}^n$ and $v_i$ along $\mathbb{R}^m$. The concept of a mode is borrowed from vibration analysis. If you strike a metal bar, it vibrates. The motion may look messy, but it can be decomposed into a superposition of different modes, where the eigenvalue and eigenvector of each mode describe how fast it vibrates and what the displacement looks like.

Now let’s check a concrete, everyday example. Say we have a matrix $A$ that records your $m = 3$ friends’ {Tim, Jeff, Steve} preferences for $n = 5$ restaurants, on a scale of 0 to 100:

$$A = \begin{bmatrix} 90 & 10 & 5 \\ 90 & 10 & 5 \\ 20 & 20 & 5 \\ 10 & 20 & 5 \\ 10 & 30 & 5 \end{bmatrix}$$

It may look like an extreme example, but bear with it for a moment and look at the SVD results, particularly $\sigma_1 u_1 v_1^T$:

$$\sigma_1 = 131.7288, \quad u_1 = \begin{bmatrix} 0.6870 \\ 0.6870 \\ 0.1781 \\ 0.1035 \\ 0.1167 \end{bmatrix}, \quad v_1 = \begin{bmatrix} 0.9825 \\ 0.1736 \\ 0.0673 \end{bmatrix}$$

We can see that the first two entries of $u_1$ carry much larger weights than the other restaurants, and that the first entry of $v_1$ dominates the rest of $v_1$. So the mode, or pattern, we can conclude here is that Tim likes the first two restaurants. It’s evident, of course; we could even tell by just staring at the original table $A$.

Real-life data are more complicated, but we can use SVD to help us draw the same kind of conclusion; for example, a specific group of people likes a particular group of restaurants, and it turns out the people are vegans and the restaurants are salad bars. Now let's check the remaining singular values: they are 38 and 1.4, which means $\sigma_1$ accounts for about 77% of the total. If we compute just $\sigma_1 u_1 v_1^T$, we get the following, which is already quite close to the original table/dataset. As you can imagine, SVD can be used for data compression.

$$\sigma_1 u_1 v_1^T \approx \begin{bmatrix} 88.9 & 15.7 & 6.1 \\ 88.9 & 15.7 & 6.1 \\ 23.0 & 4.1 & 1.6 \\ 13.4 & 2.4 & 0.9 \\ 15.1 & 2.7 & 1.0 \end{bmatrix} \approx A$$

Matlab Code

A = [90,10,5;
    90,10,5;
    20,20,5;
    10,20,5;
    10,30,5];
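% full SVD: A = U*S*V', with U and V orthogonal and S holding the singular values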
[U,S,V] = svd(A)
k = 1;
U1 = U(:,k)
V1 = V(:,k)
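% dominant mode sigma_1*u_1*v_1', i.e. the rank-1 approximation of A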
U(:,k)*S(k,k)*V(:,k)'
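The lines below are not part of the original script; they are a small added check of the 77% figure and of the rank-1 reconstruction discussed above:

s = diag(S);                    % the singular values of A
s(1)/sum(s)                     % roughly 0.77: sigma_1 dominates
A1 = S(1,1)*U(:,1)*V(:,1)'      % rank-1 approximation of A
norm(A - A1)/norm(A)            % relative error of this compression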