Python Pandas: Comparing 2 DataFrames on Multiple Parameters

IIUC (if I understand correctly), you can use read_csv and merge:

import pandas as pd
import io

temp1 = u"""Sample;Chr;Start;End;Value
S1;1;100;200;1
S1;2;200;250;1
S2;1;50;75;5
S2;2;150;225;4"""
# after testing, replace io.StringIO(temp1) with the filename
dfline = pd.read_csv(io.StringIO(temp1), sep=";")

temp2 = u"""Name;Chr;Position
P1;1;105
P2;1;60
P3;1;500
P4;2;25
P5;2;220
P6;2;240"""
# after testing, replace io.StringIO(temp2) with the filename
mapfile = pd.read_csv(io.StringIO(temp2), sep=";")

print(dfline)
#   Sample  Chr  Start  End  Value
# 0     S1    1    100  200      1
# 1     S1    2    200  250      1
# 2     S2    1     50   75      5
# 3     S2    2    150  225      4

print(mapfile)
#   Name  Chr  Position
# 0   P1    1       105
# 1   P2    1        60
# 2   P3    1       500
# 3   P4    2        25
# 4   P5    2       220
# 5   P6    2       240

# merge by column Chr (pairs every interval with every position on the same Chr)
df = pd.merge(dfline, mapfile, on=['Chr'])
# select rows where Position lies strictly between Start and End
df = df[(df.Position > df.Start) & (df.Position < df.End)]
# keep only the columns of interest
df = df[['Name', 'Chr', 'Position', 'Value', 'Sample']]
print(df)
# (output as given in the original answer; row order and index labels
#  may differ in other pandas versions)
#    Name  Chr  Position  Value Sample
# 0    P1    1       105      1     S1
# 4    P2    1        60      5     S2
# 7    P5    2       220      1     S1
# 8    P6    2       240      1     S1
# 10   P5    2       220      4     S2

# if you need to reset the index
print(df.reset_index(drop=True))
#   Name  Chr  Position  Value Sample
# 0   P1    1       105      1     S1
# 1   P2    1        60      5     S2
# 2   P5    2       220      1     S1
# 3   P6    2       240      1     S1
# 4   P5    2       220      4     S2
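The same merge-then-filter approach can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the function name positions_in_intervals is hypothetical (it is not part of pandas), the column names are the ones used in the sample data above, and the final sort is an addition that is not in the original answer; it is there so the row order does not depend on how a given pandas version orders merge results.

```python
import io

import pandas as pd


def positions_in_intervals(intervals, positions):
    """Hypothetical helper: keep positions lying strictly inside an interval on the same Chr."""
    # Pair every interval with every position that shares its chromosome.
    merged = pd.merge(intervals, positions, on="Chr")
    # Strict bounds, matching the original condition (Start < Position < End).
    inside = (merged["Position"] > merged["Start"]) & (merged["Position"] < merged["End"])
    cols = ["Name", "Chr", "Position", "Value", "Sample"]
    # Sorting is an addition for a stable row order; drop it if order is irrelevant.
    return (
        merged.loc[inside, cols]
        .sort_values(["Sample", "Chr", "Position"])
        .reset_index(drop=True)
    )


# Same sample data as above, read from in-memory strings instead of files.
intervals = pd.read_csv(
    io.StringIO(
        "Sample;Chr;Start;End;Value\n"
        "S1;1;100;200;1\nS1;2;200;250;1\nS2;1;50;75;5\nS2;2;150;225;4"
    ),
    sep=";",
)
positions = pd.read_csv(
    io.StringIO("Name;Chr;Position\nP1;1;105\nP2;1;60\nP3;1;500\nP4;2;25\nP5;2;220\nP6;2;240"),
    sep=";",
)

result = positions_in_intervals(intervals, positions)
print(result)
```

Note that the merge pairs every interval with every position on the same chromosome before filtering, so the intermediate frame can become large when both inputs are big.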

Original article: http://outofmemory.cn/zaji/5662316.html
