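Here is one way to do it, using numpy.repeat and itertools.chain. Conceptually, this is exactly what you want to do: repeat some values, chain others. It is recommended for a small number of columns; otherwise, stack-based methods may fare better.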
import numpy as np
import pandas as pd
from itertools import chain

# return list from series of comma-separated strings
def chainer(s):
    return list(chain.from_iterable(s.str.split(',')))

# calculate lengths of splits
lens = df['package'].str.split(',').map(len)

# create new dataframe, repeating or chaining as appropriate
res = pd.DataFrame({'order_id': np.repeat(df['order_id'], lens),
                    'order_date': np.repeat(df['order_date'], lens),
                    'package': chainer(df['package']),
                    'package_pre': chainer(df['package_pre'])})

print(res)

   order_id order_date package package_pre
0         1  20/5/2018      p1        #111
0         1  20/5/2018      p2        #222
0         1  20/5/2018      p3        #333
1         3  22/5/2018      p4        #444
2         7  23/5/2018      p5        #555
2         7  23/5/2018      p6        #666
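The snippet above relies on a df that is not shown in the answer. As a minimal, self-contained sketch, the input below is an assumption reconstructed from the printed result, not the asker's actual data. The sketch also adds a length check that is not in the original, because the method only works if package and package_pre split into the same number of items on every row. It repeats plain arrays rather than Series, so the result gets a fresh 0..5 index instead of the repeated 0, 0, 0, 1, 2, 2 index shown above.

```python
from itertools import chain

import numpy as np
import pandas as pd

# Hypothetical input, inferred from the printed result (an assumption).
df = pd.DataFrame({
    'order_id': [1, 3, 7],
    'order_date': ['20/5/2018', '22/5/2018', '23/5/2018'],
    'package': ['p1,p2,p3', 'p4', 'p5,p6'],
    'package_pre': ['#111,#222,#333', '#444', '#555,#666'],
})


def chainer(s):
    """Flatten a Series of comma-separated strings into one list."""
    return list(chain.from_iterable(s.str.split(',')))


# Number of items per row after splitting.
lens = df['package'].str.split(',').map(len)

# Added sanity check: both list-like columns must split to equal lengths,
# otherwise the repeated and chained columns would not line up.
assert (lens == df['package_pre'].str.split(',').map(len)).all()

# Repeat the scalar columns, chain the list-like ones.
repeats = lens.to_numpy()
res = pd.DataFrame({
    'order_id': np.repeat(df['order_id'].to_numpy(), repeats),
    'order_date': np.repeat(df['order_date'].to_numpy(), repeats),
    'package': chainer(df['package']),
    'package_pre': chainer(df['package_pre']),
})

print(res)
```

If the original row labels matter, passing the Series directly to np.repeat, as in the answer, keeps them.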