Mastering Machine Learning Algorithms

Laplacian Spectral Embedding

This algorithm, based on the spectral decomposition of a graph Laplacian, performs a non-linear dimensionality reduction that tries to preserve the nearness of points on the P-dimensional manifold when they are remapped onto a Q-dimensional (with Q < P) subspace.

The procedure is very similar to the other algorithms. The first step is a k-nearest neighbor search, used to generate a graph whose vertices (we can assume there are N of them) are the samples, and whose edges are weighted using an RBF kernel:

$$W_{ij} = e^{-\frac{\left\|x_i - x_j\right\|^2}{2\sigma^2}}$$
The resulting graph is undirected, and therefore the matrix W is symmetric. We can now define a pseudo-degree matrix D (pseudo because, as the graph is weighted, each diagonal element is a sum of weights rather than an integer degree):

$$D_{ii} = \sum_{j=1}^{N} W_{ij}$$
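As an illustration, both steps can be sketched in a few lines of NumPy. This is a minimal sketch, not a reference implementation; the function name knn_rbf_graph and the defaults for k and sigma are arbitrary choices of mine:

```python
import numpy as np

def knn_rbf_graph(X, k=10, sigma=1.0):
    """Build a symmetric k-nearest neighbor graph with RBF edge weights.

    X is the (N, P) sample matrix. Returns the (N, N) weight matrix W
    and the diagonal degree matrix D. Assumes k < N.
    """
    N = X.shape[0]
    # Pairwise squared Euclidean distances (fine for moderate N)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # RBF kernel weights for every pair of samples
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    # Keep only the k nearest neighbors of each sample (column 0 of the
    # argsort is the sample itself, because its distance is zero)
    neighbors = np.argsort(sq_dists, axis=1)[:, 1:k + 1]
    mask = np.zeros((N, N), dtype=bool)
    rows = np.repeat(np.arange(N), k)
    mask[rows, neighbors.ravel()] = True
    # Symmetrize the adjacency so that the graph is undirected
    mask |= mask.T
    W = np.where(mask, W, 0.0)
    # Degree matrix: D_ii is the sum of the weights incident to x_i
    D = np.diag(W.sum(axis=1))
    return W, D
```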
The low-dimensional representation Φ is obtained by minimizing the function:

$$L_\Phi = \sum_{i,j} W_{ij} \left\| \frac{\phi_i}{\sqrt{D_{ii}}} - \frac{\phi_j}{\sqrt{D_{jj}}} \right\|^2$$
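For concreteness, here is a minimal sketch (the helper name is mine, not from the book) that evaluates this loss for a candidate embedding Φ, given the W and D built in the previous step:

```python
import numpy as np

def embedding_loss(Phi, W, D):
    """Evaluate L_Phi for a candidate embedding (rows of Phi are phi_i)."""
    # Rescale each row by 1 / sqrt(D_ii); assumes every D_ii > 0
    Y = Phi / np.sqrt(np.diag(D))[:, None]
    # Squared Euclidean distances between all pairs of rescaled rows
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    # Weighted sum over all (i, j) pairs
    return np.sum(W * sq)
```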
If the two points xi and xj are near, the corresponding Wij is close to 1, while it tends to 0 when the distance tends to ∞. Dii is the sum of all the weights originating from xi (and the same holds for Djj). Now, let's suppose that xi is very close only to xj, so that we can approximate Dii = Djj ≈ Wij. The resulting term is a square loss based on the difference between the vectors φi and φj. When instead there are multiple closeness relationships to consider, the factor Wij divided by the square root of DiiDjj reweights the new distances to find the best trade-off for the whole dataset. In practice, LΦ is not minimized directly. In fact, it's possible to prove that the minimum can be obtained through the spectral decomposition of the symmetric normalized graph Laplacian (from which the algorithm takes its name):

$$L_s = D^{-\frac{1}{2}}(D - W)D^{-\frac{1}{2}} = I - D^{-\frac{1}{2}} W D^{-\frac{1}{2}}$$
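A minimal sketch of this decomposition, assuming the W and D built earlier (the function name and the default Q are mine):

```python
import numpy as np

def laplacian_spectral_embedding(W, D, Q=2):
    """Embed the graph via the symmetric normalized Laplacian L_s."""
    # Assumes every sample has at least one neighbor, so D_ii > 0
    d_inv_sqrt = 1.0 / np.sqrt(np.diag(D))
    # L_s = I - D^(-1/2) W D^(-1/2)
    L_s = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns the eigenvalues of a symmetric matrix in ascending order
    eigvals, eigvecs = np.linalg.eigh(L_s)
    # Bottom Q + 1 eigenvectors; the first one (constant, eigenvalue 0)
    # carries no information and is dropped
    return eigvecs[:, 1:Q + 1]
```

As a sanity check, embedding_loss(Phi, W, D) evaluated on the returned columns should match 2 * np.trace(Phi.T @ L_s @ Phi) up to floating-point error, which is the identity linking the loss to the Laplacian.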
Just like for the LLE algorithm, Laplacian Spectral Embedding also works with the bottom Q + 1 eigenvectors (those associated with the smallest eigenvalues). The mathematical theory behind this last step is again based on the application of the Rayleigh-Ritz method. The eigenvector associated with the null eigenvalue (a constant vector carrying no information) is discarded, and the remaining Q determine the low-dimensional representation φi.
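In practice, there is no need to implement the procedure manually: scikit-learn exposes it as the SpectralEmbedding class in sklearn.manifold (the algorithm is also known as Laplacian Eigenmaps). A short usage sketch on a synthetic Swiss roll dataset, where all parameter values are illustrative rather than prescriptive:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import SpectralEmbedding

# Swiss roll: a 2-dimensional manifold embedded in 3 dimensions
X, _ = make_swiss_roll(n_samples=1000, noise=0.05, random_state=1000)

# Laplacian Spectral Embedding with a k-nearest neighbor affinity graph
se = SpectralEmbedding(n_components=2, affinity='nearest_neighbors',
                       n_neighbors=15, random_state=1000)
X_se = se.fit_transform(X)
print(X_se.shape)  # (1000, 2)
```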