You can use the following groupby-agg operation:
In [38]: result = df.groupby(['a'], as_index=False).agg({'c':['mean','std'],'b':'first', 'd':'first'})
Then rename the columns and reorder them:
In [39]: result.columns = ['a','c','e','b','d']

In [40]: result.reindex(columns=sorted(result.columns))
Out[40]:
        a  b    c  d         e
0   Apple  3  4.5  7  0.707107
1  Banana  4  4.0  8       NaN
2  Cherry  7  1.0  3       NaN
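As a possible alternative to renaming by hand, here is a minimal sketch using named aggregation (available in pandas 0.25 and later), which names the output columns directly. The original `df` is not shown, so the rows below are an assumption chosen to match the output above; the `b` and `d` values of the second Apple row are arbitrary.

```python
import pandas as pd

# Hypothetical input: the original df is not shown, so these rows are assumed.
# The b and d values of the second Apple row are arbitrary placeholders.
df = pd.DataFrame({
    'a': ['Apple', 'Apple', 'Banana', 'Cherry'],
    'b': [3, 5, 4, 7],
    'c': [4, 5, 4, 1],
    'd': [7, 9, 8, 3],
})

# Named aggregation: output_column=(input_column, aggregation).
# Listing the keywords in sorted order makes the reindex step unnecessary.
result = df.groupby('a', as_index=False).agg(
    b=('b', 'first'),
    c=('c', 'mean'),
    d=('d', 'first'),
    e=('c', 'std'),  # sample standard deviation (ddof=1), NaN for one-row groups
)
print(result)
```

The output columns come out in the order the keywords are listed, after the grouping column `a`.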
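By default, pandas computes the sample standard deviation (ddof=1), which is NaN for a group with a single row. To compute the population standard deviation instead: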
def pop_std(x):
    return x.std(ddof=0)

result = df.groupby(['a'], as_index=False).agg({'c':['mean',pop_std],'b':'first', 'd':'first'})
result.columns = ['a','c','e','b','d']
result.reindex(columns=sorted(result.columns))
yields
        a  b    c  d    e
0   Apple  3  4.5  7  0.5
1  Banana  4  4.0  8  0.0
2  Cherry  7  1.0  3  0.0
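The difference between the two results comes from the divisor: the variance is divided by n - ddof before taking the square root. The short check below illustrates this on standalone Series. The values 4 and 5 are an assumption, chosen only because they are consistent with the Apple group's mean of 4.5.

```python
import pandas as pd

# Assumed values: two observations with mean 4.5, and a single observation.
two = pd.Series([4, 5])
one = pd.Series([4])

sample_two = two.std()            # ddof=1: sqrt(0.5 / 1)
population_two = two.std(ddof=0)  # ddof=0: sqrt(0.5 / 2)

sample_one = one.std()            # ddof=1: divisor is 0, so the result is NaN
population_one = one.std(ddof=0)  # ddof=0: a single value has zero spread

print(sample_two, population_two, sample_one, population_one)
```

This is why the single-row groups show NaN with the default `std` and 0.0 with `ddof=0`.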