Python sklearn's KMeans and k_means

1.

K-means clustering is a form of unsupervised learning; by default it uses Euclidean distance.
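As a quick check that assignment really is by Euclidean distance, a minimal sketch on made-up points (the data and the query point below are invented for illustration): the centroid found by manually computing Euclidean distances matches what `predict` returns.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up toy data: two well-separated blobs
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# A new point near the second blob
p = np.array([[4.8, 5.1]])

# Manual assignment: index of the centroid with the smallest Euclidean distance
dists = np.linalg.norm(model.cluster_centers_ - p, axis=1)
assert dists.argmin() == model.predict(p)[0]
```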

from sklearn.cluster import KMeans,k_means

No. 1: cluster.KMeans([n_clusters, init, n_init, ...]) — the K-means clustering estimator

No. 2: cluster.k_means(X, n_clusters[, init, ...]) — the K-means clustering algorithm (a plain function)

(1) Judging from their purpose and source code, the two serve different goals:

No. 2: k_means is the K-means clustering algorithm as a plain function — it simply partitions the dataset into k clusters (i.e., it exposes the raw algorithm), which is why you pass X (the dataset) directly to k_means(X, n_clusters[, init, ...]).

No. 1: KMeans is the K-means estimator. You first configure the number of clusters in KMeans([n_clusters, init, n_init, ...]); the class then provides the methods fit (compute the k-means clustering) and predict (predict the closest cluster for each sample in X). In that sense it is elevated into an unsupervised machine-learning model.
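The contrast above can be sketched on tiny made-up data: the function returns a plain tuple in one call, while the estimator keeps its fitted state and can label new samples afterwards.

```python
import numpy as np
from sklearn.cluster import KMeans, k_means

# Made-up toy data: two well-separated blobs
X = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [8.1, 7.9]])

# No. 2, the function: one call, returns a tuple (centroids, labels, inertia)
centroids, labels, inertia = k_means(X, n_clusters=2, n_init=10, random_state=0)

# No. 1, the estimator: configure, fit, then reuse on new data
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
new_labels = model.predict(np.array([[0.9, 1.1]]))  # only the estimator offers predict
```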

2. Examples

The dataset is:

1.

from sklearn.cluster import KMeans
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

data = pd.read_csv("C:/Users/CWY/Desktop/deeplearn/Personalized-recommend-master/test/three_class_data.csv")

x = data[["x", "y"]]
model = KMeans(n_clusters=3)
model.fit(x)

# Build a fine grid covering the data and predict the cluster of every grid point
x_min, x_max = data['x'].min() - 1, data['x'].max() + 1
y_min, y_max = data['y'].min() - 1, data['y'].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, .01), np.arange(y_min, y_max, .01))
result = model.predict(np.c_[xx.ravel(), yy.ravel()])
result = result.reshape(xx.shape)

# Shade the cluster regions, plot the samples colored by label, and mark the centers
plt.contourf(xx, yy, result, cmap=plt.cm.Greens)
plt.scatter(data['x'], data['y'], c=model.labels_, s=15)
center = model.cluster_centers_
plt.scatter(center[:, 0], center[:, 1], marker='p', linewidths=2, color='b', edgecolors='w', zorder=20)
plt.show()

2.

from sklearn.cluster import k_means
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

data = pd.read_csv("C:/Users/CWY/Desktop/deeplearn/Personalized-recommend-master/test/three_class_data.csv")
x = data[["x", "y"]]
model = k_means(x, n_clusters=3)  # returns a tuple (centroids, labels, inertia)
plt.scatter(data['x'], data['y'], c=model[1])  # model[1] holds the per-sample labels
plt.show()
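The CSV path above is local to the author's machine; as a runnable variant, the same two examples can be tried on synthetic data from `make_blobs` (a hypothetical stand-in for three_class_data.csv):

```python
import pandas as pd
from sklearn.cluster import KMeans, k_means
from sklearn.datasets import make_blobs

# Stand-in for three_class_data.csv: three Gaussian blobs
points, _ = make_blobs(n_samples=150, centers=3, random_state=0)
data = pd.DataFrame(points, columns=["x", "y"])
x = data[["x", "y"]]

# Estimator form (example 1)
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(x)
print(model.cluster_centers_)

# Function form (example 2): result[1] holds the labels
result = k_means(x, n_clusters=3, n_init=10, random_state=0)
print(result[1][:10])
```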

Feel free to share; when reposting, please credit the source: 内存溢出

Original article: http://outofmemory.cn/zaji/4678961.html
