to_datetime accepts a format string:
In [92]:
t = 20070530
pd.to_datetime(str(t), format='%Y%m%d')

Out[92]:
Timestamp('2007-05-30 00:00:00')
Example:
In [94]:
t = 20070530
df = pd.DataFrame({'date':[t]*10})
df

Out[94]:
       date
0  20070530
1  20070530
2  20070530
3  20070530
4  20070530
5  20070530
6  20070530
7  20070530
8  20070530
9  20070530

In [98]:
df['DateTime'] = df['date'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))
df

Out[98]:
       date   DateTime
0  20070530 2007-05-30
1  20070530 2007-05-30
2  20070530 2007-05-30
3  20070530 2007-05-30
4  20070530 2007-05-30
5  20070530 2007-05-30
6  20070530 2007-05-30
7  20070530 2007-05-30
8  20070530 2007-05-30
9  20070530 2007-05-30

In [99]:
df.dtypes

Out[99]:
date                 int64
DateTime    datetime64[ns]
dtype: object
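Edit

It is actually much faster to cast the column to strings and then convert the whole Series to datetime in one call, rather than calling apply on every value: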
In [102]:
df['DateTime'] = pd.to_datetime(df['date'].astype(str), format='%Y%m%d')
df

Out[102]:
       date   DateTime
0  20070530 2007-05-30
1  20070530 2007-05-30
2  20070530 2007-05-30
3  20070530 2007-05-30
4  20070530 2007-05-30
5  20070530 2007-05-30
6  20070530 2007-05-30
7  20070530 2007-05-30
8  20070530 2007-05-30
9  20070530 2007-05-30
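
Outside IPython, the two approaches can be put side by side in a standalone script. This is only a minimal sketch: the column name date and the ten repeated values come from the example above, while the variable names via_apply and via_astype are made up here for illustration.

```python
import pandas as pd

# Ten copies of the integer date used in the example above.
df = pd.DataFrame({'date': [20070530] * 10})

# Row by row: pd.to_datetime is called once per value.
via_apply = df['date'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))

# Vectorized: cast the whole column to strings, then parse it in a single call.
via_astype = pd.to_datetime(df['date'].astype(str), format='%Y%m%d')

# Both routes should yield the same dates, element by element.
print((via_apply == via_astype).all())
print(via_astype.iloc[0])
```

The element-wise comparison is used instead of Series.equals because the two results may not share exactly the same datetime dtype on every pandas version.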
Timings
In [104]:
%timeit df['date'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))

100 loops, best of 3: 2.55 ms per loop

In [105]:
%timeit pd.to_datetime(df['date'].astype(str), format='%Y%m%d')

1000 loops, best of 3: 396 µs per loop
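
The %timeit magic only works inside IPython. As a rough equivalent for a plain Python script, the comparison could be sketched with the standard-library timeit module. The helper names with_apply and with_astype and the repeat count n are assumptions chosen for this sketch, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'date': [20070530] * 10})

def with_apply():
    # One pd.to_datetime call per row.
    return df['date'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))

def with_astype():
    # One pd.to_datetime call for the whole column.
    return pd.to_datetime(df['date'].astype(str), format='%Y%m%d')

n = 100  # number of repetitions per measurement (arbitrary choice)
t_apply = timeit.timeit(with_apply, number=n) / n
t_astype = timeit.timeit(with_astype, number=n) / n

print(f"apply:  {t_apply * 1e6:.0f} µs per call")
print(f"astype: {t_astype * 1e6:.0f} µs per call")
```

With only ten rows, fixed call overhead accounts for a large share of both measurements, so a larger frame gives a better picture of how the two approaches scale.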