Singular Value Decomposition (SVD) is a method to factorize or decompose a matrix. Let's say we view $A$, a matrix whose dimension is $m \times n$, as a transform or mapping from $x$, which belongs to $\mathbb{R}^n$, to $y$, which belongs to $\mathbb{R}^m$, i.e.:

$$y = Ax$$
SVD states:

$$A = U \Sigma V^T$$
$U$ and $V$, orthogonal (unitary) matrices, can be viewed as two rotational operands, and the diagonal entries $\sigma_1 \ge \sigma_2 \ge \cdots$ of $\Sigma$ can be viewed as scaling factors along each singular vector. But why two rotations and one scaling? In practice, the rows and columns of $A$ are usually associated with some meaningful properties, for example, features and items. We are really interested in the relationship between the rows and the columns, defined by $A$ itself. SVD is used to simplify/clarify that relationship. For example, the columns of $U$ can be viewed as patterns in the column space of $A$, sorted by the singular values given in $\Sigma$. More specifically, we can expand the SVD into a sum of singular-vector/value/vector trios:

$$A = \sum_k \sigma_k u_k v_k^T$$
Each trio $\sigma_k u_k v_k^T$ here can be viewed as a mode or pattern in $A$, with $\sigma_k$ describing how important it is; $u_k$ describes its distribution along the rows and $v_k$ along the columns. The concept of a mode is borrowed from vibration analysis. If you hit a metal bar with a beam, it will vibrate. It may look messy, but the vibration can be decomposed into a superposition of different modes. The eigenvalue and eigenvector of each mode are associated with how fast it vibrates and what the displacement looks like.
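To make the mode picture concrete, here is a minimal NumPy sketch (the matrix is arbitrary, just to illustrate the algebra): the columns of $U$ and $V$ are orthonormal, and summing all the rank-1 trios recovers $A$ exactly.

```python
import numpy as np

# An arbitrary real 5x3 matrix; any real matrix works.
rng = np.random.default_rng(42)
A = rng.standard_normal((5, 3))

# Thin SVD: s holds the singular values in descending order, Vt is V transposed.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U.T @ U, np.eye(3))    # "rotation" part 1: orthonormal columns
assert np.allclose(Vt @ Vt.T, np.eye(3))  # "rotation" part 2: orthonormal columns

# Each sigma_k * u_k * v_k^T is one rank-1 mode; their superposition is A.
modes = [s[k] * np.outer(U[:, k], Vt[k, :]) for k in range(len(s))]
assert np.allclose(sum(modes), A)
```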
Now let's check a concrete, everyday example. Let's say we have a matrix $A$ that records your friends' {Tim, Jeff, Steve} preferences for restaurants {1, …, 5}, on a scale of 0 to 100:

                  Tim   Jeff   Steve
    Restaurant 1   90     10       5
    Restaurant 2   90     10       5
    Restaurant 3   20     20       5
    Restaurant 4   10     20       5
    Restaurant 5   10     30       5
It may look like an extreme example, but let's bear with it for a moment and look at the SVD results, particularly the first trio $\sigma_1 u_1 v_1^T$. We can see that the first two restaurants have a larger weight than the other restaurants in $u_1$, and Tim is more significant in $v_1$. So the mode/pattern we can conclude here is: Tim likes the first two restaurants. It's evident, of course; we can even tell by just staring at the original table $A$.
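This pattern can be checked numerically. A NumPy sketch (note that SVD only fixes $u_1$ and $v_1$ up to a common sign flip, so we compare absolute values):

```python
import numpy as np

# Preference table: rows are restaurants 1-5, columns are Tim, Jeff, Steve.
A = np.array([[90., 10., 5.],
              [90., 10., 5.],
              [20., 20., 5.],
              [10., 20., 5.],
              [10., 30., 5.]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
u1, v1 = U[:, 0], Vt[0, :]

print(np.round(np.abs(u1), 2))  # restaurants 1 and 2 dominate u1
print(np.round(np.abs(v1), 2))  # Tim (the first entry) dominates v1

assert np.argmax(np.abs(v1)) == 0                  # Tim has the largest weight
assert set(np.argsort(np.abs(u1))[-2:]) == {0, 1}  # restaurants 1 and 2 lead
```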
However, it is more complicated in real-life data, and we can use SVD to help us draw a similar conclusion; for example, a specific group of people likes a particular group of restaurants, and it turns out the people are vegans and the restaurants are salad bars. Now let's check the singular values: the rest of them are 38 and 1.4, which means $\sigma_1 \approx 132$ accounts for 77% of the total. If we just calculate $\sigma_1 u_1 v_1^T$, it turns out to be approximately the following, which is already close to the original table/dataset:

                  Tim   Jeff   Steve
    Restaurant 1 88.9   15.7     6.1
    Restaurant 2 88.9   15.7     6.1
    Restaurant 3 23.0    4.1     1.6
    Restaurant 4 13.4    2.4     0.9
    Restaurant 5 15.1    2.7     1.0

As you can imagine, SVD can be used for data compression.
Matlab Code
A = [90,10,5;   % rows: restaurants 1-5
     90,10,5;   % columns: Tim, Jeff, Steve
     20,20,5;
     10,20,5;
     10,30,5];
[U,S,V] = svd(A)        % A = U*S*V'
k = 1;                  % keep only the dominant mode
U1 = U(:,k)
V1 = V(:,k)
U(:,k)*S(k,k)*V(:,k)'   % rank-1 approximation of A
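For readers without Matlab, an equivalent NumPy sketch (note that `numpy.linalg.svd` returns $V$ already transposed and the singular values as a vector, and that Python indexes from 0):

```python
import numpy as np

# Same preference table as the Matlab snippet above.
A = np.array([[90., 10., 5.],
              [90., 10., 5.],
              [20., 20., 5.],
              [10., 20., 5.],
              [10., 30., 5.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.round(s, 1))            # singular values, roughly 131.7, 38, and 1.4
print(round(s[0] / s.sum(), 2))  # share of the first one: about 0.77

k = 0                            # mode 1 is index 0 in Python
A1 = s[k] * np.outer(U[:, k], Vt[k, :])
print(np.round(A1, 1))           # rank-1 approximation of A
```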