pandas PyTables: how to specify min_itemsize for the elements of a MultiIndex

You need to specify the name of the MultiIndex level for which you want to set min_itemsize. Here is an example:

Create two multi-indexed frames:

# Assumes: import numpy as np; import pandas as pd; from pandas import DataFrame, MultiIndex
In [1]: df1 = DataFrame(np.random.randn(4,2),index=MultiIndex.from_product([['abcdefghijklm','foo'],[1,2]],names=['string','number']))

In [2]: df2 = DataFrame(np.random.randn(4,2),index=MultiIndex.from_product([['abcdefghijklmop','foo'],[1,2]],names=['string','number']))

In [3]: df1
Out[3]:
                             0         1
string        number
abcdefghijklm 1       0.737976  0.840718
              2       0.605763  1.797398
foo           1       1.589278  0.104186
              2       0.029387  1.417195

[4 rows x 2 columns]

In [4]: df2
Out[4]:
                               0         1
string          number
abcdefghijklmop 1       0.539507 -1.059085
                2       1.263722 -1.773187
foo             1       1.625073  0.078650
                2      -0.030827 -1.691805

[4 rows x 2 columns]

Create the store:

In [9]: store = pd.HDFStore('test.h5',mode='w')

In [10]: store.append('df1',df1)

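Here is the item size that gets computed for the string level (13, the length of the longest string in df1):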

In [12]: store.get_storer('df1').table
Out[12]:
/df1/table (Table(4,)) ''
  description := {
  "index": Int64Col(shape=(), dflt=0, pos=0),
  "values_block_0": Float64Col(shape=(2,), dflt=0.0, pos=1),
  "number": Int64Col(shape=(), dflt=0, pos=2),
  "string": StringCol(itemsize=13, shape=(), dflt='', pos=3)}
  byteorder := 'little'
  chunkshape := (1456,)
  autoindex := True
  colindexes := {
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "number": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "string": Index(6, medium, shuffle, zlib(1)).is_csi=False}

Here is the error you currently get when appending df2 to that table:

In [13]: store.append('df1',df2)
ValueError: Trying to store a string with len [15] in [string] column but
this column has a limit of [13]!
Consider using min_itemsize to preset the sizes on these columns

Specify min_itemsize using the level name when the table is first created:

In [14]: store.append('df',df1,min_itemsize={ 'string' : 15 })

In [15]: store.get_storer('df').table
Out[15]:
/df/table (Table(4,)) ''
  description := {
  "index": Int64Col(shape=(), dflt=0, pos=0),
  "values_block_0": Float64Col(shape=(2,), dflt=0.0, pos=1),
  "number": Int64Col(shape=(), dflt=0, pos=2),
  "string": StringCol(itemsize=15, shape=(), dflt='', pos=3)}
  byteorder := 'little'
  chunkshape := (1394,)
  autoindex := True
  colindexes := {
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "number": Index(6, medium, shuffle, zlib(1)).is_csi=False,
    "string": Index(6, medium, shuffle, zlib(1)).is_csi=False}
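The value 15 here was chosen by hand to fit the longest string in df2. As a rough sketch, the size could instead be measured from the data before the table is created. The helper name `level_itemsize` is hypothetical (it is not part of pandas), and the code assumes strings are stored UTF-8 encoded, so it measures bytes rather than characters:

```python
import numpy as np
import pandas as pd


def level_itemsize(frames, level):
    """Longest UTF-8 byte length found in one index level across all frames.

    Hypothetical helper for illustration; not a pandas function.
    """
    longest = 0
    for frame in frames:
        for value in frame.index.get_level_values(level).unique():
            longest = max(longest, len(str(value).encode("utf-8")))
    return longest


def make_frame(long_label):
    # Same shape as the frames in the example: a two-level index named string/number.
    index = pd.MultiIndex.from_product(
        [[long_label, "foo"], [1, 2]], names=["string", "number"]
    )
    return pd.DataFrame(np.random.randn(4, 2), index=index)


df1 = make_frame("abcdefghijklm")
df2 = make_frame("abcdefghijklmop")

# Dictionary in the form accepted by the min_itemsize argument of store.append.
min_itemsize = {"string": level_itemsize([df1, df2], "string")}
print(min_itemsize)
```

This only helps when all the frames (or at least an upper bound on the label length) are known before the first append, because the column width is fixed when the table is created.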

Now the append works:

In [16]: store.append('df',df2)

In [19]: store.df
Out[19]:
                               0         1
string          number
abcdefghijklm   1       0.737976  0.840718
                2       0.605763  1.797398
foo             1       1.589278  0.104186
                2       0.029387  1.417195
abcdefghijklmop 1       0.539507 -1.059085
                2       1.263722 -1.773187
foo             1       1.625073  0.078650
                2      -0.030827 -1.691805

[8 rows x 2 columns]

In [20]: store.close()
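For convenience, the whole session can be condensed into one script. This is a minimal sketch rather than the original author's code: the function name `append_frames`, the column labels `A` and `B`, and the temporary file location are assumptions made for illustration, and it requires the PyTables package (`tables`) to be installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def make_frame(long_label):
    # Two-level index named string/number, as in the session above.
    index = pd.MultiIndex.from_product(
        [[long_label, "foo"], [1, 2]], names=["string", "number"]
    )
    return pd.DataFrame(np.random.randn(4, 2), index=index, columns=["A", "B"])


def append_frames(path, frames, level, size, key="df"):
    """Append all frames to one table, reserving `size` bytes for `level`.

    Hypothetical helper: min_itemsize is passed on the first append only,
    since that is the call that creates the table.
    """
    with pd.HDFStore(path, mode="w") as store:
        for position, frame in enumerate(frames):
            if position == 0:
                store.append(key, frame, min_itemsize={level: size})
            else:
                store.append(key, frame)
        # Width of the stored string column, read from the PyTables table.
        itemsize = store.get_storer(key).table.coldtypes[level].itemsize
        return store[key], int(itemsize)


df1 = make_frame("abcdefghijklm")
df2 = make_frame("abcdefghijklmop")

with tempfile.TemporaryDirectory() as tmp:
    combined, itemsize = append_frames(
        os.path.join(tmp, "test.h5"), [df1, df2], level="string", size=15
    )

print(len(combined), itemsize)
```

Using `HDFStore` as a context manager closes the file even if an append raises, which replaces the explicit `store.close()` call.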


Source: 内存溢出, https://outofmemory.cn/zaji/5674440.html
