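The extra columns with the suffixes '_x' and '_y' appear because both DataFrames contain columns with the same names that are not used as merge keys, so the name clash produces additional columns. In that case, you need to drop the extra '_y' columns and rename the '_x' columns: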
In [145]:
# define our drop function
def drop_y(df):
    # list comprehension of the cols that end with '_y'
    to_drop = [x for x in df if x.endswith('_y')]
    df.drop(to_drop, axis=1, inplace=True)

drop_y(merged)
merged

Out[145]:
               key dept_name_x res_name_x  year_x  need  holding  \
0  DeptA_ResA_2015       DeptA       ResA    2015     1        1
1  DeptA_ResA_2016       DeptA       ResA    2016     1        1
2  DeptA_ResA_2017       DeptA       ResA    2017     1        1

   no_of_inv  inv_cost_wo_ice
0          1          1000000
1          0                0
2          0                0

In [146]:
# func to rename '_x' cols
def rename_x(df):
    # slice off the trailing '_x'; rstrip('_x') strips characters, not a suffix
    new_names = {col: col[:-2] for col in df if col.endswith('_x')}
    df.rename(columns=new_names, inplace=True)

rename_x(merged)
merged

Out[146]:
               key dept_name res_name  year  need  holding  no_of_inv  \
0  DeptA_ResA_2015     DeptA     ResA  2015     1        1          1
1  DeptA_ResA_2016     DeptA     ResA  2016     1        1          0
2  DeptA_ResA_2017     DeptA     ResA  2017     1        1          0

   inv_cost_wo_ice
0          1000000
1                0
2                0
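One caveat when removing a suffix from column names: `str.rstrip('_x')` treats its argument as a set of characters, not as a literal suffix, so it can remove more than intended. The sketch below illustrates the difference; `strip_suffix` is a hypothetical helper and the small DataFrame is made-up data, neither comes from the question.

```python
import pandas as pd


def strip_suffix(name, suffix):
    """Return name without a trailing suffix (hypothetical helper)."""
    if suffix and name.endswith(suffix):
        return name[:-len(suffix)]
    return name


# rstrip removes any trailing run of the characters '_' and 'x'.
print('max_x'.rstrip('_x'))         # ma
print(strip_suffix('max_x', '_x'))  # max

# Made-up frame: rename every column through the helper in one call.
df = pd.DataFrame({'max_x': [1], 'year_x': [2015], 'need': [1]})
df = df.rename(columns=lambda c: strip_suffix(c, '_x'))
print(df.columns.tolist())          # ['max', 'year', 'need']
```

On Python 3.9 or later, the built-in `str.removesuffix('_x')` does the same job as the helper.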
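Edit: If you add the common columns to the merge keys, the merge should not produce duplicated columns. Rows that have no match on all of those columns simply get NaN in the columns from the right-hand DataFrame, which fillna(0) then replaces with 0: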
merge_df = pd.merge(holding_df, invest_df, on=['key', 'dept_name', 'res_name', 'year'], how='left').fillna(0)
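To see the difference between the two merges side by side, here is a minimal sketch. The contents of `holding_df` and `invest_df` are assumptions, made up to resemble the output shown earlier; the real frames from the question are not reproduced here.

```python
import pandas as pd

# Hypothetical data shaped like the frames in the question.
holding_df = pd.DataFrame({
    'key': ['DeptA_ResA_2015', 'DeptA_ResA_2016', 'DeptA_ResA_2017'],
    'dept_name': ['DeptA'] * 3,
    'res_name': ['ResA'] * 3,
    'year': [2015, 2016, 2017],
    'need': [1, 1, 1],
    'holding': [1, 1, 1],
})
invest_df = pd.DataFrame({
    'key': ['DeptA_ResA_2015'],
    'dept_name': ['DeptA'],
    'res_name': ['ResA'],
    'year': [2015],
    'no_of_inv': [1],
    'inv_cost_wo_ice': [1000000],
})

# Merging on 'key' alone: the other shared columns clash and get suffixes.
by_key = pd.merge(holding_df, invest_df, on='key', how='left')
print(by_key.columns.tolist())

# Merging on every shared column: nothing clashes, so no suffixes appear.
common = ['key', 'dept_name', 'res_name', 'year']
merge_df = pd.merge(holding_df, invest_df, on=common, how='left').fillna(0)
print(merge_df.columns.tolist())
```

With this data, the first merge yields the suffixed `dept_name_x` / `dept_name_y` style columns, while the second keeps a single copy of each shared column and fills the unmatched 2016 and 2017 rows with 0.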