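OK, this is partly inspired by this question: Convert variable number of columns to binary matrix.
So read the csv, but override the separator to a tab so that it doesn't try to split the names: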
In [7]:
import io
import pandas as pd

t = """Anne,Beth,Caroline,Ernie,Frank,Hannah
Beth,Caroline,David,Ernie
Caroline,Hannah
David,,Anne,Beth,Caroline,Ernie
Ernie,Anne,Beth,Frank,George
Frank,Anne,Caroline,Hannah
George,
Hannah,Anne,Beth,Caroline,David,Ernie,Frank,George"""
# sep='\t': the data contains no tabs, so each line stays in a single column
df = pd.read_csv(io.StringIO(t), sep='\t', header=None)
df

Out[7]:
                                                   0
0              Anne,Beth,Caroline,Ernie,Frank,Hannah
1                          Beth,Caroline,David,Ernie
2                                    Caroline,Hannah
3                    David,,Anne,Beth,Caroline,Ernie
4                       Ernie,Anne,Beth,Frank,George
5                         Frank,Anne,Caroline,Hannah
6                                            George,
7  Hannah,Anne,Beth,Caroline,David,Ernie,Frank,Ge...
Now we can use str.split with expand=True to expand the names into their own columns:
In [8]:
df[0].str.split(',', expand=True)

Out[8]:
          0         1         2         3         4       5      6       7
0      Anne      Beth  Caroline     Ernie     Frank  Hannah   None    None
1      Beth  Caroline     David     Ernie      None    None   None    None
2  Caroline    Hannah      None      None      None    None   None    None
3     David                Anne      Beth  Caroline   Ernie   None    None
4     Ernie      Anne      Beth     Frank    George    None   None    None
5     Frank      Anne  Caroline    Hannah      None    None   None    None
6    George                None      None      None    None   None    None
7    Hannah      Anne      Beth  Caroline     David   Ernie  Frank  George
So, to be clear, change your read_csv line to this:

df = pd.read_csv(infile, header=None, sep='\t')

and then do the str.split as above.
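
Putting the two steps together, a minimal end-to-end sketch might look like the following. The helper name read_names is hypothetical (it is not part of pandas), and the sketch assumes the input never contains a tab character, so that each line is read whole into column 0.

```python
import io

import pandas as pd

# Sample data from the answer; a file path can be passed instead of StringIO.
SAMPLE = """Anne,Beth,Caroline,Ernie,Frank,Hannah
Beth,Caroline,David,Ernie
Caroline,Hannah
David,,Anne,Beth,Caroline,Ernie
Ernie,Anne,Beth,Frank,George
Frank,Anne,Caroline,Hannah
George,
Hannah,Anne,Beth,Caroline,David,Ernie,Frank,George"""


def read_names(source):
    """Hypothetical helper: read ragged comma-separated rows into a wide frame."""
    # Assumption: no tabs in the data, so sep='\t' keeps each line in column 0.
    raw = pd.read_csv(source, sep="\t", header=None)
    # Split on commas; expand=True puts each name in its own column and
    # pads shorter rows with missing values.
    return raw[0].str.split(",", expand=True)


names = read_names(io.StringIO(SAMPLE))
print(names.shape)  # (8, 8)
```

Note that an empty field between two commas (as in the David and George rows) comes through as an empty string rather than a missing value, so it may need separate handling depending on what you do next.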