>>> df
   A    B
0  1   Ms
1  1   Ms
2  1   Ms
3  1   Ms
4  1  PhD
5  2   Ms
6  2   Ms
7  2   Bs
8  2  PhD
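def sort_df(df, column_idx, key):
    '''Takes a dataframe, a column label and a custom key function for sorting,
    returns a dataframe sorted by that column using that function'''
    # .ix was removed in pandas 1.0; use .loc for labels and .iloc for positions
    col = df.loc[:, column_idx]
    # key receives (value, position) tuples; keep only the positions, in sorted order
    order = [i[1] for i in sorted(zip(col, range(len(col))), key=key)]
    return df.iloc[order]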
Our sort key function:
cmp = lambda x:2 if 'PhD' in x else 1 if 'Bs' in x else 0
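As a small illustration of how this key behaves, the sketch below applies it to a plain list. The `degrees` list is made up for the example. Note that `sort_df` hands the key a `(value, position)` tuple rather than a bare string; the `in` test still works there because tuple membership compares whole elements.

```python
# Same key as above: PhD ranks highest, then Bs, everything else lowest.
cmp = lambda x: 2 if 'PhD' in x else 1 if 'Bs' in x else 0

# Hypothetical sample values, only for illustration.
degrees = ['Ms', 'Bs', 'PhD', 'Ms']

# sorted() is stable, so the two 'Ms' entries keep their relative order.
ordered = sorted(degrees, key=cmp)
print(ordered)

# The key also accepts the (value, position) tuples that sort_df passes in.
print(cmp(('PhD', 4)), cmp(('Bs', 7)), cmp(('Ms', 0)))
```

Because the highest-ranked value ends up last, keeping the last row per group after sorting selects the highest degree.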
In action:
# keep='last' replaces the take_last=True argument, which newer pandas versions no longer accept
sort_df(df, 'B', cmp).drop_duplicates('A', keep='last')
   A    B
4  1  PhD
8  2  PhD
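For readers on a recent pandas release, the same result can probably be reached without a helper function. The following is a minimal sketch, assuming pandas 1.1 or later (where `sort_values` accepts a `key` argument); the name `rank` is introduced here for illustration and is not part of the original answer.

```python
import pandas as pd

# Rebuild the example frame shown at the top.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 2, 2, 2, 2],
    "B": ["Ms", "Ms", "Ms", "Ms", "PhD", "Ms", "Ms", "Bs", "PhD"],
})

def rank(value):
    # Same ordering as the key above: PhD highest, then Bs, everything else lowest.
    return 2 if "PhD" in value else 1 if "Bs" in value else 0

result = (
    # key receives the whole column as a Series, so map the rank over it;
    # mergesort is stable, so ties keep their original row order.
    df.sort_values("B", key=lambda s: s.map(rank), kind="mergesort")
      # After sorting, the last row of each A group carries the highest rank.
      .drop_duplicates("A", keep="last")
)
print(result)
```

The sort-then-deduplicate idea is unchanged; only the removed `.ix` and `take_last` spellings are avoided.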