IIUC, you can use read_csv and merge:
    import pandas as pd
    import io

    temp1 = """Sample;Chr;Start;End;Value
    S1;1;100;200;1
    S1;2;200;250;1
    S2;1;50;75;5
    S2;2;150;225;4"""
    # after testing, replace io.StringIO(temp1) with the filename
    dfline = pd.read_csv(io.StringIO(temp1), sep=";")

    temp2 = """Name;Chr;Position
    P1;1;105
    P2;1;60
    P3;1;500
    P4;2;25
    P5;2;220
    P6;2;240"""
    # after testing, replace io.StringIO(temp2) with the filename
    mapfile = pd.read_csv(io.StringIO(temp2), sep=";")

    print(dfline)
    #   Sample  Chr  Start  End  Value
    # 0     S1    1    100  200      1
    # 1     S1    2    200  250      1
    # 2     S2    1     50   75      5
    # 3     S2    2    150  225      4

    print(mapfile)
    #   Name  Chr  Position
    # 0   P1    1       105
    # 1   P2    1        60
    # 2   P3    1       500
    # 3   P4    2        25
    # 4   P5    2       220
    # 5   P6    2       240

    # merge on column Chr
    df = pd.merge(dfline, mapfile, on=['Chr'])
    # keep rows where Position lies strictly between Start and End
    df = df[(df.Position > df.Start) & (df.Position < df.End)]
    # select a subset of the columns
    df = df[['Name', 'Chr', 'Position', 'Value', 'Sample']]
    print(df)
    #    Name  Chr  Position  Value Sample
    # 0    P1    1       105      1     S1
    # 4    P2    1        60      5     S2
    # 7    P5    2       220      1     S1
    # 8    P6    2       240      1     S1
    # 10   P5    2       220      4     S2

    # if you need to reset the index
    print(df.reset_index(drop=True))
    #   Name  Chr  Position  Value Sample
    # 0   P1    1       105      1     S1
    # 1   P2    1        60      5     S2
    # 2   P5    2       220      1     S1
    # 3   P6    2       240      1     S1
    # 4   P5    2       220      4     S2
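If you need this more than once, the merge-then-filter steps can be wrapped in a small helper. This is only a minimal sketch, assuming the same column names as the sample data; the function name `positions_in_intervals` and the final sort are my own additions for illustration, not part of pandas. It keeps the strict inequalities used above, so a Position exactly equal to Start or End is excluded.

```python
import io

import pandas as pd


def positions_in_intervals(intervals, positions):
    """Return positions lying strictly inside an interval on the same Chr.

    Hypothetical helper: assumes `intervals` has columns
    Sample, Chr, Start, End, Value and `positions` has Name, Chr, Position.
    """
    # Pair every interval with every position on the same chromosome.
    merged = pd.merge(intervals, positions, on="Chr")
    # Keep only pairs where Position is strictly between Start and End.
    mask = (merged["Position"] > merged["Start"]) & (merged["Position"] < merged["End"])
    cols = ["Name", "Chr", "Position", "Value", "Sample"]
    # Sort explicitly so the result does not depend on merge's row order.
    return merged.loc[mask, cols].sort_values(["Sample", "Name"]).reset_index(drop=True)


intervals = pd.read_csv(
    io.StringIO(
        "Sample;Chr;Start;End;Value\n"
        "S1;1;100;200;1\n"
        "S1;2;200;250;1\n"
        "S2;1;50;75;5\n"
        "S2;2;150;225;4\n"
    ),
    sep=";",
)
positions = pd.read_csv(
    io.StringIO(
        "Name;Chr;Position\n"
        "P1;1;105\n"
        "P2;1;60\n"
        "P3;1;500\n"
        "P4;2;25\n"
        "P5;2;220\n"
        "P6;2;240\n"
    ),
    sep=";",
)

result = positions_in_intervals(intervals, positions)
print(result)
```

Note that merging on Chr alone builds every interval/position pair per chromosome before filtering, so the intermediate frame can get large on big inputs.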