You can use concat together with str.get_dummies:
    print(pd.concat([df['Id'], df['Field'].str.get_dummies(sep=",")], axis=1))

       Id  Economics  English-107  English-2  History  Java-2  Literature  \
    0  56          1            1          0        1       0           0
    1  11          0            0          0        0       1           0
    2   6          0            0          1        0       0           1
    3  43          0            0          0        1       0           1
    4  14          0            1          0        0       1           0

       Management  Mathematics  Philosophy  Web-development
    0           1            0           1                0
    1           0            0           0                1
    2           0            0           0                0
    3           0            1           1                0
    4           1            0           0                0
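The DataFrame itself is not shown in this answer, so the following is only a minimal sketch for trying the call. The sample `df` is an assumption, reconstructed from the 0/1 output above; the real `Field` strings in the question may be ordered or spelled differently.

```python
import pandas as pd

# Hypothetical sample data, inferred from the indicator output shown above.
df = pd.DataFrame({
    "Id": [56, 11, 6, 43, 14],
    "Field": [
        "Economics,English-107,History,Management,Philosophy",
        "Java-2,Web-development",
        "English-2,Literature",
        "History,Literature,Mathematics,Philosophy",
        "English-107,Java-2,Management",
    ],
})

# str.get_dummies splits each string on "," and builds one 0/1 column per tag.
dummies = df["Field"].str.get_dummies(sep=",")

# concat with axis=1 places the Id column next to the indicator columns.
result = pd.concat([df["Id"], dummies], axis=1)
print(result)
```

Note that str.get_dummies only records whether a tag is present, so a tag repeated within one row still yields 1.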
If you need counts rather than 0/1 indicators, you can use pivot_table (I added a second Economics string to the first row for testing):
    df1 = df['Field'].str.split(',', expand=True).stack().groupby(level=0).value_counts().reset_index()
    df1.columns = ['a', 'b', 'c']
    print(df1.pivot_table(index='a', columns='b', values='c').fillna(0))

    b  Economics  English-107  English-2  History  Java-2  Literature  Management  \
    a
    0          2            1          0        1       0           0           1
    1          0            0          0        0       1           0           0
    2          0            0          1        0       0           1           0
    3          0            0          0        1       0           1           0
    4          0            1          0        0       1           0           1

    b  Mathematics  Philosophy  Web-development
    a
    0            0           1                0
    1            0           0                1
    2            0           0                0
    3            1           1                0
    4            0           0                0
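In newer pandas versions (0.25 or later, where `Series.explode` exists), the same per-row counts can probably be obtained more directly. The sketch below is an alternative, not the method used in this answer; it swaps in `explode` and `pd.crosstab`, and the sample `df` is again a hypothetical reconstruction from the output above.

```python
import pandas as pd

# Hypothetical sample data; "Economics" appears twice in the first row
# so that a count greater than 1 shows up.
df = pd.DataFrame({
    "Id": [56, 11, 6, 43, 14],
    "Field": [
        "Economics,Economics,English-107,History,Management,Philosophy",
        "Java-2,Web-development",
        "English-2,Literature",
        "History,Literature,Mathematics,Philosophy",
        "English-107,Java-2,Management",
    ],
})

# One row per (original row, tag) pair. reset_index() moves the original
# row label into a column named "index".
long = df["Field"].str.split(",").explode().reset_index()

# Count how often each tag occurs in each original row.
counts = pd.crosstab(long["index"], long["Field"])
print(counts)
```

Because crosstab counts occurrences directly, the result holds integers and needs no fillna step.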