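It seems your roundtrip IS producing some unicode. I'm not sure why that happens, but it is easy to fix. You cannot store unicode in an HDFStore Table in Python 2 (it works correctly in Python 3, however). If you need to, you can store it in Fixed format instead (it will be pickled). See here.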
In [33]: df = pd.read_json(s)

In [25]: df
Out[25]: 
  args                date            host kwargs     operation  status   thingy      time
0   [] 2013-12-02 00:33:59  yy38.segm1.org     {}       x_gbinf    -101  a13yy38  0.000801
1   [] 2013-12-02 00:33:59  kyy1.segm1.org     {}     x_initobj       1  a19kyy1  0.003244
2   [] 2013-12-02 00:34:00  yy10.segm1.org     {}  x_gobjParams    -101  a14yy10  0.002247
3   [] 2013-12-02 00:34:00  yy24.segm1.org     {}        gtfull    -101  a14yy24  0.002787
4   [] 2013-12-02 00:34:00  yy24.segm1.org     {}       x_gbinf    -101  a14yy24  0.001067
5   [] 2013-12-02 00:34:00  yy34.segm1.org     {}       gxyzinf    -101  a12yy34  0.002652
6   [] 2013-12-02 00:34:00  yy15.segm1.org     {}     deletemfg       1  a15yy15  0.004371
7   [] 2013-12-02 00:34:00  yy15.segm1.org     {}       gxyzinf    -101  a15yy15  0.000602

[8 rows x 8 columns]

In [26]: df.dtypes
Out[26]: 
args                 object
date         datetime64[ns]
host                 object
kwargs               object
operation            object
status                int64
thingy               object
time                float64
dtype: object
This infers the actual type of each object-dtyped Series. A column comes out as unicode only if at least one of its strings is unicode (otherwise it is inferred as string).
In [27]: df.apply(lambda x: pd.lib.infer_dtype(x.values))
Out[27]: 
args            unicode
date         datetime64
host            unicode
kwargs          unicode
operation       unicode
status          integer
thingy          unicode
time           floating
dtype: object
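For readers on a current pandas, where `pd.lib` is no longer available, the same per-column inference can be sketched with `pandas.api.types.infer_dtype`. This is only an illustration under assumptions: the small frame below is made-up sample data, not the frame from the question, and on Python 3 text columns are reported as string rather than unicode.

```python
import pandas as pd
from pandas.api.types import infer_dtype

# Hypothetical sample data, loosely modelled on the frame in this answer.
df = pd.DataFrame({
    "host": pd.Series(["yy38.segm1.org", "kyy1.segm1.org"], dtype=object),
    "status": [-101, 1],
    "time": [0.000801, 0.003244],
})

# Infer the type of the values in each column, not just the storage dtype.
inferred = df.apply(lambda col: infer_dtype(col, skipna=True))
print(inferred)
```

The point is the same as in the session above: `df.dtypes` only says `object`, while `infer_dtype` looks at the values themselves.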
Here is a way to "fix" it:
In [28]: types = df.apply(lambda x: pd.lib.infer_dtype(x.values))

In [29]: types[types=='unicode']
Out[29]: 
args         unicode
host         unicode
kwargs       unicode
operation    unicode
thingy       unicode
dtype: object

In [30]: for col in types[types=='unicode'].index:
   ....:     df[col] = df[col].astype(str)
   ....:
It looks the same:
In [31]: df
Out[31]: 
  args                date            host kwargs     operation  status   thingy      time
0   [] 2013-12-02 00:33:59  yy38.segm1.org     {}       x_gbinf    -101  a13yy38  0.000801
1   [] 2013-12-02 00:33:59  kyy1.segm1.org     {}     x_initobj       1  a19kyy1  0.003244
2   [] 2013-12-02 00:34:00  yy10.segm1.org     {}  x_gobjParams    -101  a14yy10  0.002247
3   [] 2013-12-02 00:34:00  yy24.segm1.org     {}        gtfull    -101  a14yy24  0.002787
4   [] 2013-12-02 00:34:00  yy24.segm1.org     {}       x_gbinf    -101  a14yy24  0.001067
5   [] 2013-12-02 00:34:00  yy34.segm1.org     {}       gxyzinf    -101  a12yy34  0.002652
6   [] 2013-12-02 00:34:00  yy15.segm1.org     {}     deletemfg       1  a15yy15  0.004371
7   [] 2013-12-02 00:34:00  yy15.segm1.org     {}       gxyzinf    -101  a15yy15  0.000602

[8 rows x 8 columns]
But now the types are inferred correctly:
In [32]: df.apply(lambda x: pd.lib.infer_dtype(x.values))
Out[32]: 
args            string
date        datetime64
host            string
kwargs          string
operation       string
status         integer
thingy          string
time          floating
dtype: object
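The infer-then-cast steps could be wrapped in a small helper, roughly as sketched below. The function name `cast_inferred_to_str` and the sample frame are hypothetical, and it uses `pandas.api.types.infer_dtype` on the assumption that a current pandas is installed. Including "mixed" in the default kinds is also an assumption: on Python 3 nothing is reported as unicode, but columns holding lists or dicts are inferred as mixed and may need the same cast.

```python
import pandas as pd
from pandas.api.types import infer_dtype


def cast_inferred_to_str(df, kinds=("unicode", "mixed")):
    """Return a copy of df with columns whose inferred type is in kinds cast to str.

    Hypothetical helper; the default kinds are an assumption, not part of pandas.
    """
    out = df.copy()
    types = out.apply(lambda col: infer_dtype(col, skipna=True))
    for col in types[types.isin(list(kinds))].index:
        out[col] = out[col].astype(str)
    return out


# Made-up sample data: "args" holds empty lists, as in the frame above.
df = pd.DataFrame({
    "args": pd.Series([[], []], dtype=object),
    "host": pd.Series(["yy38.segm1.org", "kyy1.segm1.org"], dtype=object),
    "status": [-101, 1],
})

fixed = cast_inferred_to_str(df)
result = fixed.apply(lambda col: infer_dtype(col, skipna=True))
print(result)
```

Working on a copy keeps the original frame intact, so the before and after inference can be compared side by side.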