
Example of Laplacian Spectral Embedding

Let's apply this algorithm to the same dataset using the Scikit-Learn class SpectralEmbedding, with n_components=2 and n_neighbors=15:

from sklearn.datasets import fetch_olivetti_faces
from sklearn.manifold import SpectralEmbedding

# Load the Olivetti faces dataset used in the previous examples
faces = fetch_olivetti_faces()

se = SpectralEmbedding(n_components=2, n_neighbors=15)
X_se = se.fit_transform(faces['data'])

The resulting plot (zoomed in due to the presence of a high-density region) is shown in the following graph:

Laplacian Spectral Embedding applied to the Olivetti faces dataset
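
The original figure isn't reproduced here; a minimal Matplotlib sketch along these lines could generate a comparable plot (the figure size, marker size, and colormap are arbitrary assumptions, not choices from the original example):

import matplotlib.pyplot as plt

# Scatter the 2D embedding, coloring each point by its subject label
fig, ax = plt.subplots(figsize=(10, 8))
ax.scatter(X_se[:, 0], X_se[:, 1], c=faces['target'], cmap='tab20', s=15)
ax.set_xlabel('First component')
ax.set_ylabel('Second component')
ax.set_title('Laplacian Spectral Embedding of the Olivetti faces dataset')
plt.show()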

Even in this case, we can see that some classes are grouped into small clusters, but at the same time we observe many agglomerates containing mixed samples. Both this method and the previous one work with local pieces of information, trying to find low-dimensional representations that preserve the geometrical structure of micro-features. This condition leads to a mapping where close points share local features (this is almost always true for images, but it's very difficult to prove for generic samples). Therefore, we can observe small clusters containing elements belonging to the same class, but also some apparent outliers which, on the original manifold, can be globally different even if they share local patches. By contrast, methods such as Isomap or t-SNE work with the whole distribution and try to determine a representation that is almost isometric with the original dataset, taking its global properties into account.
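
To make the local-versus-global distinction concrete, here is a minimal sketch (not part of the original example) that fits Isomap on the same data with the same neighborhood size. Isomap also starts from a k-nearest-neighbor graph, but it then approximates geodesic distances across the whole graph before embedding:

from sklearn.manifold import Isomap

# Isomap: neighborhood graph + shortest-path (geodesic) distances,
# followed by a global metric embedding
isomap = Isomap(n_components=2, n_neighbors=15)
X_iso = isomap.fit_transform(faces['data'])

Plotting X_iso next to X_se would highlight the difference: the global method tries to preserve large-scale pairwise distances, while the spectral embedding only guarantees that neighboring points stay close.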