Merging two DataFrames in pandas without column names (new to pandas)

Short version:

If your data has duplicate column names, make sure to rename one of the columns when reading the file.

If your data contains NaN or similar values, remove them.

Then merge using the accepted answer below.
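The three steps above can be sketched as follows. This is a minimal illustration, not the author's exact script: the inline CSV text, the reduced column list, and the cast back to `int64` are assumptions made so the example runs on its own.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two header-less CSV files.
underlying_csv = io.StringIO("20040326,3.579987\n20040329,3.690494\n,3.755247\n")
options_csv = io.StringIO("20040326,SVXY,32.5\n20130628,SVXY,32.5\n")

# Step 1: give every column a distinct name while reading.
underlying = pd.read_csv(underlying_csv, names=["dt1", "price"])
options = pd.read_csv(options_csv, names=["dt2", "ticker", "strike"])

# Step 2: drop rows whose merge key is missing, then restore the integer dtype
# (a column containing NaN is read as float).
underlying = underlying.dropna(subset=["dt1"]).astype({"dt1": "int64"})
options = options.dropna(subset=["dt2"]).astype({"dt2": "int64"})

# Step 3: an outer merge keeps the rows from both sides.
merged = underlying.merge(options, left_on="dt1", right_on="dt2", how="outer")
print(merged)
```

Rows that exist on only one side come back with NaN in the other side's columns, which is the expected result of an outer merge.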

This is probably a very simple question.

I read in two datasets with pandas.read_csv().

My data is in two separate CSV files.

I am using the following code:

    import mibian
    import pandas as pd

    underlying = pd.read_csv("txt1.csv", names=['dt1', 'price'])
    options = pd.read_csv("txt2.txt", names=['dt2', 'ticker', 'maturity', 'strike', 'cP', 'px', 'yIEld', 'rF', 'T', 'rlzd10'])
    merged = underlying.merge(options, left_on='dt1', right_on='dt2')

The heads of my two datasets look like this:

    >>> underlying.head()
              0         1
    0  20040326  3.579987
    1  20040329  3.690494
    2  20040330  3.755247
    3  20040331  3.719373
    4  20040401  3.728671

    >>> options.head()
              0     1         2     3     4      5     6  7      8         9        10
    0  20130628  SVXY  20130817  32.5  call  39.22  32.5  0  0.005  0.136986  0.411224

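So column 0 in each dataset is the key I want to merge on, and I want to keep all the data from both datasets in the result.

How do I do that? Every example I found online needs a named key column, but my data has none.

When I do the join, I get the following error: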

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Applications/Spyder.app/Contents/Resources/lib/python2.7/spyderlib/Widgets/externalshell/sitecustomize.py", line 540, in runfile
        execfile(filename,namespace)
      File "/Users/jasonmellone/.spyder2/.temp.py", line 12, in <module>
        merged = underlying.merge(options,right_on='dt2',how='outer');
      File "/library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/core/frame.py", line 3723, in merge
        suffixes=suffixes,copy=copy)
      File "/library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/tools/merge.py", line 40, in merge
        return op.get_result()
      File "/library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/tools/merge.py", line 197, in get_result
        result_data = join_op.get_result()
      File "/library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/tools/merge.py", line 722, in get_result
        return BlockManager(result_blocks,self.result_axes)
      File "/library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/core/internals.py", line 1954, in __init__
        self._set_ref_locs(do_refs=True)
      File "/library/Python/2.7/site-packages/pandas-0.13.0-py2.7-macosx-10.9-intel.egg/pandas/core/internals.py", line 2091, in _set_ref_locs
        'have _ref_locs set' % (block,labels))
    AssertionError: Cannot create BlockManager._ref_locs because block [IntBlock: [dt1],1 x 372145,dtype: int64] with duplicate items [Index([u'dt1',u'price',u'dt2',u'ticker',u'maturity',u'strike',u'cP',u'px',u'yIEld',u'rF',u'T',u'rlzd10'],dtype='object')] does not have _ref_locs set

I searched my dataset and found no duplicates.

Thanks!
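A claim like "no duplicates" can be checked directly before merging. The sketch below is only an illustration: the helper `key_report` and the two small frames are hypothetical, and it reports the conditions named in the short version (duplicate column names, missing keys) plus repeated key values.

```python
import pandas as pd

# Hypothetical frames standing in for the real data.
underlying = pd.DataFrame(
    {"dt1": [20040326, 20040329, 20040329], "price": [3.58, 3.69, 3.69]}
)
options = pd.DataFrame({"dt2": [20130628, None], "ticker": ["SVXY", "SVXY"]})


def key_report(df, key):
    """Count conditions that commonly complicate a merge on `key`."""
    return {
        "duplicate_column_names": int(df.columns.duplicated().sum()),
        "missing_keys": int(df[key].isna().sum()),
        "repeated_keys": int(df[key].duplicated().sum()),
    }


print(key_report(underlying, "dt1"))
print(key_report(options, "dt2"))
```

Repeated key values are not an error in themselves, but in an outer merge they multiply rows, so it helps to know about them up front.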

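Solution

You can still merge on the column. Because the heads above show the files read without column names, the labels are the integers 0, 1, ..., so the key is 0 rather than the string '0':

    merged = underlying.merge(options, on=0)

This performs an inner merge, so you only get the intersection of the two datasets, that is, the rows whose key value exists in both. If you want all values, specify an outer merge:

    merged = underlying.merge(options, on=0, how='outer')

    In [10]: merged = underlying.merge(options, on=0, how='outer')
             merged
    Out[10]:
              0       1_x   1_y         2     3     4      5     6   7      8  \
    0  20040326  3.579987   NaN       NaN   NaN   NaN    NaN   NaN NaN    NaN
    1  20040329  3.690494   NaN       NaN   NaN   NaN    NaN   NaN NaN    NaN
    2  20040330  3.755247   NaN       NaN   NaN   NaN    NaN   NaN NaN    NaN
    3  20040331  3.719373   NaN       NaN   NaN   NaN    NaN   NaN NaN    NaN
    4  20040401  3.728671   NaN       NaN   NaN   NaN    NaN   NaN NaN    NaN
    5  20130628       NaN  SVXY  20130817  32.5  call  39.22  32.5   0  0.005

              9        10
    0       NaN       NaN
    1       NaN       NaN
    2       NaN       NaN
    3       NaN       NaN
    4       NaN       NaN
    5  0.136986  0.411224

    [6 rows x 12 columns]

You will have to rename or drop the clashing columns 1_x and 1_y shown above.

It is better to rename the columns to something sensible beforehand. When reading the CSV, you can pass a list of column names:

    df = pd.read_csv('data.csv', names=['ID', 'Price'])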
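If the frames are already loaded with integer labels, renaming before the merge avoids the clash altogether. The following is a sketch under assumptions: the inline data and the names `date`, `price`, `ticker` and `strike` are illustrative, and `indicator=True` is added only to show which side each row came from.

```python
import io
import pandas as pd

# Hypothetical header-less data; header=None yields integer labels 0, 1, ...
left = pd.read_csv(io.StringIO("20040326,3.579987\n20040329,3.690494\n"), header=None)
right = pd.read_csv(io.StringIO("20040326,SVXY,32.5\n20130628,SVXY,32.5\n"), header=None)

# Rename so that only the key column shares a name across the two frames.
left = left.rename(columns={0: "date", 1: "price"})
right = right.rename(columns={0: "date", 1: "ticker", 2: "strike"})

# The outer merge keeps every row; the _merge column records its origin
# as "both", "left_only" or "right_only".
merged = left.merge(right, on="date", how="outer", indicator=True)
print(merged)
```

With distinct names no `_x`/`_y` suffixes appear, and the `_merge` column makes it easy to see which keys were missing from either file.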

Original article: http://outofmemory.cn/langs/1197066.html
