Comparing against the previous row's value in a Pandas DataFrame


Compare the column with a copy of itself shifted down by one row, using eq() together with shift():

    df['match'] = df.col1.eq(df.col1.shift())
    print(df)

       col1  match
    0     1  False
    1     3  False
    2     3   True
    3     1  False
    4     2  False
    5     3  False
    6     2  False
    7     2   True
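To make the intermediate step visible, here is a minimal sketch that stores the shifted column separately before comparing. It reuses the sample data from the timing section; the variable name prev is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 3, 3, 1, 2, 3, 2, 2]})

# shift() moves every value down one row; the first row has no
# predecessor, so it becomes NaN (and the column becomes float).
prev = df["col1"].shift()

# NaN never compares equal to a number, so the first row is always False.
df["match"] = df["col1"].eq(prev)

print(prev.tolist())         # [nan, 1.0, 3.0, 3.0, 1.0, 2.0, 3.0, 2.0]
print(df["match"].tolist())  # [False, False, True, False, False, False, False, True]
```

Because the first row is compared against NaN, it is reported as "no match" rather than raising an error or being left empty.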

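Alternatively, replace eq() with the == operator. The result is the same, although the timings below suggest it is marginally slower on a large DataFrame: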

    df['match'] = df.col1 == df.col1.shift()
    print(df)

       col1  match
    0     1  False
    1     3  False
    2     3   True
    3     1  False
    4     2  False
    5     3  False
    6     2  False
    7     2   True
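As a quick sanity check, the following sketch confirms that the two spellings produce identical results on the sample data. The names by_method and by_operator are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 3, 3, 1, 2, 3, 2, 2]})

by_method = df["col1"].eq(df["col1"].shift())   # method form
by_operator = df["col1"] == df["col1"].shift()  # operator form

# Series.equals() checks that both boolean Series hold the same values.
print(by_method.equals(by_operator))  # True

# A boolean Series sums to the number of True entries,
# i.e. the number of rows equal to their predecessor.
print(int(by_method.sum()))  # 2
```

Choosing between the two is mostly a matter of style; the method form is convenient when chaining several calls.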

Timings:

    import pandas as pd

    data = {'col1': [1, 3, 3, 1, 2, 3, 2, 2]}
    df = pd.DataFrame(data, columns=['col1'])
    print(df)

    # Repeat the 8 sample rows 10000 times: [80000 rows x 1 columns]
    df = pd.concat([df] * 10000).reset_index(drop=True)
    df['match'] = df.col1 == df.col1.shift()
    df['match1'] = df.col1.eq(df.col1.shift())
    print(df)

    In [208]: %timeit df.col1.eq(df.col1.shift())
    The slowest run took 4.83 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000 loops, best of 3: 933 µs per loop

    In [209]: %timeit df.col1 == df.col1.shift()
    1000 loops, best of 3: 1 ms per loop
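The %timeit magic only works inside IPython. For a plain Python script, a rough equivalent can be sketched with the standard-library timeit module. The helper names with_eq and with_operator and the run counts are assumptions chosen for illustration, and the measured numbers will vary with the machine and the pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({"col1": [1, 3, 3, 1, 2, 3, 2, 2]})
df = pd.concat([df] * 10000).reset_index(drop=True)  # 80000 rows


def with_eq():
    return df["col1"].eq(df["col1"].shift())


def with_operator():
    return df["col1"] == df["col1"].shift()


n_runs = 50  # calls per timing batch (arbitrary choice)

# Take the best of 3 batches and divide by the calls per batch.
t_eq = min(timeit.repeat(with_eq, number=n_runs, repeat=3)) / n_runs
t_op = min(timeit.repeat(with_operator, number=n_runs, repeat=3)) / n_runs

print(f"eq: {t_eq * 1e6:.0f} µs per call")
print(f"==: {t_op * 1e6:.0f} µs per call")
```

The gap reported in the original timings is small (933 µs versus 1 ms), so it may shrink or disappear in other environments.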


Original article: http://outofmemory.cn/zaji/5643615.html
