I think you need set_index with unstack:
    df['COUNT'] = df.groupby(['ID','col']).cumcount() + 1
    df = (df.set_index(['ID','col','COUNT'])['col2']
            .unstack()
            .add_prefix('col')
            .reset_index())
    print(df)

    COUNT  ID col  col1  col2
    0       1   A  50.0  52.0
    1       1   B  45.0   NaN
    2       1   C  18.0   NaN
Or pass the counter Series straight to set_index instead of adding a helper column:
    c = df.groupby(['ID','col']).cumcount() + 1
    df = (df.set_index(['ID','col', c])['col2']
            .unstack()
            .add_prefix('col')
            .reset_index())
    print(df)

       ID col  col1  col2
    0   1   A  50.0  52.0
    1   1   B  45.0   NaN
    2   1   C  18.0   NaN
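To try this end to end, here is a minimal self-contained sketch. The input frame is an assumption: the question's data is not shown here, so it is reconstructed from the printed result above, and the names `counter` and `wide` are only illustrative.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed output (an assumption)
df = pd.DataFrame({
    'ID': [1, 1, 1, 1],
    'col': ['A', 'A', 'B', 'C'],
    'col2': [50, 52, 45, 18],
})

# Number the rows inside each (ID, col) group: 1, 2, ...
counter = df.groupby(['ID', 'col']).cumcount() + 1

wide = (
    df.set_index(['ID', 'col', counter])['col2']  # counter becomes the last index level
      .unstack()                                  # last index level -> columns 1, 2
      .add_prefix('col')                          # 1 -> 'col1', 2 -> 'col2'
      .reset_index()
)
print(wide)
```

The counter is what makes each (ID, col, counter) combination unique, which unstack requires; without it the duplicated (1, A) pair would raise an error.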
EDIT:
For multiple columns the solution changes a bit, because unstack now produces a MultiIndex in the columns, which has to be flattened:
    # cast the counter to str so the column levels can be joined with '_' later
    df['COUNT'] = (df.groupby(['ID','col']).cumcount() + 1).astype(str)
    # no ['col2'] selection this time, so every remaining column is unstacked
    df = df.set_index(['ID','col','COUNT']).unstack()
    # flatten the MultiIndex in the columns
    df.columns = df.columns.map('_'.join)
    df = df.reset_index()
    print(df)

       ID col  col2_1  col2_2 col3_1 col3_2  col4_1  col4_2
    0   1   A    50.0    52.0      S      M     1.0     4.0
    1   1   B    45.0     NaN      N   None     8.0     NaN
    2   1   C    18.0     NaN      S   None     7.0     NaN
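A possible variant of the flattening step, sketched under assumptions: the input frame below is reconstructed from the printed output (the original data is not shown), and `counter` and `wide` are illustrative names. Building the flat names with an f-string means the counter can stay an integer, so the astype(str) cast is not needed.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed output (an assumption)
df = pd.DataFrame({
    'ID': [1, 1, 1, 1],
    'col': ['A', 'A', 'B', 'C'],
    'col2': [50, 52, 45, 18],
    'col3': ['S', 'M', 'N', 'S'],
    'col4': [1, 4, 8, 7],
})

# Integer position of each row inside its (ID, col) group: 1, 2, ...
counter = df.groupby(['ID', 'col']).cumcount() + 1

# Unstacking a whole frame gives MultiIndex columns of (original column, counter)
wide = df.set_index(['ID', 'col', counter]).unstack()

# Flatten each (name, counter) pair into a single label such as 'col2_1'
wide.columns = [f'{name}_{i}' for name, i in wide.columns]
wide = wide.reset_index()
print(wide)
```

Both ways of flattening should give the same column labels; the f-string form just avoids converting the counter beforehand.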