Unsupervised learning notes

Unsupervised Learning: Dimensionality Reduction
  1. PCA

Find the directions (principal components) that capture the most variance in the data.

(https://www.youtube.com/watch?v=HMOI_lkzW08)

from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.datasets import load_breast_cancer

(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)

# Before applying PCA, each feature should be centered (zero mean) and scaled to unit variance
X_normalized = StandardScaler().fit_transform(X_cancer)

pca = PCA(n_components = 2).fit(X_normalized)

X_pca = pca.transform(X_normalized)
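As a quick sanity check after fitting, `explained_variance_ratio_` reports the fraction of total variance each component retains. A minimal self-contained sketch (on this dataset the first two components typically keep roughly 60% of the variance, though the exact figure depends on the data):

```python
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.datasets import load_breast_cancer

X_cancer, y_cancer = load_breast_cancer(return_X_y=True)
X_normalized = StandardScaler().fit_transform(X_cancer)

pca = PCA(n_components=2).fit(X_normalized)

# Fraction of total variance captured by each principal component
print(pca.explained_variance_ratio_)
# Total variance retained by the 2D projection
print(pca.explained_variance_ratio_.sum())
```

If the retained variance is low, a 2D projection may be discarding too much structure, and a larger `n_components` (or a manifold method, below) may be more appropriate.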
  2. Manifold Learning

Multidimensional scaling (MDS) attempts to find a distance-preserving low-dimensional projection.

t-SNE finds a 2D projection that preserves information about each point's neighbours. (It also works from distances in the high-dimensional space, but it focuses on preserving the distances between nearby points rather than all pairwise distances.)

(https://distill.pub/2016/misread-tsne/#citation)

from sklearn.preprocessing import StandardScaler
from sklearn.manifold import MDS

# X_fruits: feature matrix of the course's fruits dataset (assumed already loaded)
# each feature should be centered (zero mean) and scaled to unit variance
X_fruits_normalized = StandardScaler().fit_transform(X_fruits)

mds = MDS(n_components = 2)

X_fruits_mds = mds.fit_transform(X_fruits_normalized)


from sklearn.manifold import TSNE

tsne = TSNE(random_state = 0)

X_tsne = tsne.fit_transform(X_fruits_normalized)
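Since both embeddings are 2D, they can be compared quantitatively with scikit-learn's `trustworthiness` score, which measures how well local neighbourhoods of the original data survive the projection (1.0 is perfect). A hedged sketch, substituting the breast cancer data because the fruits dataset is not loaded here:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.manifold import MDS, TSNE, trustworthiness

X, _ = load_breast_cancer(return_X_y=True)
X_norm = StandardScaler().fit_transform(X)

# Fit both embeddings with a fixed seed for reproducibility
X_mds = MDS(n_components=2, random_state=0).fit_transform(X_norm)
X_tsne = TSNE(n_components=2, random_state=0).fit_transform(X_norm)

# trustworthiness is in [0, 1]: closer to 1 means nearby points in
# the original space are still nearby in the 2D embedding
print(trustworthiness(X_norm, X_mds))
print(trustworthiness(X_norm, X_tsne))
```

Because t-SNE optimizes for neighbour preservation directly, it usually scores higher on this metric than MDS, at the cost of distorting global distances.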

Original source: http://outofmemory.cn/langs/739522.html