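It looks like you need read_csv to load the data into a DataFrame first, keeping only the second and third columns, and then convert it to a NumPy array with values:

import pandas as pd
from io import StringIO  # standard-library StringIO (pandas.compat no longer provides it)
from sklearn.cluster import KMeans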
temp = u"""col,iid,rat
4,1,0
5,2,4
6,3,3
7,4,1"""

# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), usecols=[1, 2])
print(df)
#    iid  rat
# 0    1    0
# 1    2    4
# 2    3    3
# 3    4    1

X = df.values
print(X)
# [[1 0]
#  [2 4]
#  [3 3]
#  [4 1]]

kmeans = KMeans(n_clusters=2)
a = kmeans.fit(X)
print(a)
# KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300, n_clusters=2, n_init=10, n_jobs=1, precompute_distances='auto', random_state=None, tol=0.0001, verbose=0)
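Printing the fitted estimator only shows its parameters, so as a possible follow-up, here is a minimal sketch of reading the cluster assignments back out. It reuses the same sample data. The `random_state=0` and explicit `n_init=10` arguments and the `cluster` column name are assumptions added for reproducibility and illustration; they are not part of the answer above.

```python
from io import StringIO

import pandas as pd
from sklearn.cluster import KMeans

temp = u"""col,iid,rat
4,1,0
5,2,4
6,3,3
7,4,1"""

# Keep only the second and third columns, as in the answer above.
df = pd.read_csv(StringIO(temp), usecols=[1, 2])
X = df.values

# random_state and n_init are set here only to make the run repeatable (assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)

# labels_ holds one cluster index per row of X;
# cluster_centers_ holds one centroid per cluster.
labels = kmeans.labels_
centers = kmeans.cluster_centers_

# Attach the labels to the original rows (hypothetical column name).
df["cluster"] = labels
print(df)
print(centers)
```

Which cluster gets index 0 and which gets index 1 is arbitrary, so compare rows by whether they share a label rather than by the label value itself.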