It looks like you need to replace the `,` characters with empty strings:
    print (df)
    2016-10-31    2,144.78
    2016-07-31    2,036.62
    2016-04-30    1,916.60
    2016-01-31    1,809.40
    2015-10-31    1,711.97
    2016-01-31    6,667.22
    2015-01-31    5,373.59
    2014-01-31    4,071.00
    2013-01-31    3,050.20
    2016-09-30       -0.06
    2016-06-30       -1.88
    2016-03-31            
    2015-12-31       -0.13
    2015-09-30            
    2015-12-31       -0.14
    2014-12-31        0.07
    2013-12-31           0
    2012-12-31           0
    Name: val, dtype: object

    print (pd.to_numeric(df.str.replace(',',''), errors='coerce'))
    2016-10-31    2144.78
    2016-07-31    2036.62
    2016-04-30    1916.60
    2016-01-31    1809.40
    2015-10-31    1711.97
    2016-01-31    6667.22
    2015-01-31    5373.59
    2014-01-31    4071.00
    2013-01-31    3050.20
    2016-09-30      -0.06
    2016-06-30      -1.88
    2016-03-31        NaN
    2015-12-31      -0.13
    2015-09-30        NaN
    2015-12-31      -0.14
    2014-12-31       0.07
    2013-12-31       0.00
    2012-12-31       0.00
    Name: val, dtype: float64
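For a version of this step that runs on its own, here is a minimal sketch. The sample values and the name `s` are made up for illustration and are not the asker's data. `regex=False` is added so that the comma is treated as a literal character whatever the pandas version.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as text, with thousands
# separators and one blank entry.
s = pd.Series(['2,144.78', '-0.06', '', '0'], name='val')

# Drop the commas, then convert; values that cannot be parsed become NaN.
converted = pd.to_numeric(s.str.replace(',', '', regex=False), errors='coerce')
print(converted)
```

The result has dtype float64, and the blank entry comes back as NaN instead of raising an error.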
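EDIT:
If you build the data with append, the dtype of the first df may be float while the second is object. The result is then a mixed column, for example the first rows hold float values and the last rows hold strings, so you need to cast to str first:
    print (pd.to_numeric(df.astype(str).str.replace(',',''), errors='coerce'))
You can also check the types with: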
    print (df.apply(type))
    2016-09-30    <class 'float'>
    2016-06-30    <class 'float'>
    2015-12-31    <class 'float'>
    2014-12-31    <class 'float'>
    2014-01-31      <class 'str'>
    2013-01-31      <class 'str'>
    2016-09-30      <class 'str'>
    2016-06-30      <class 'str'>
    2016-03-31      <class 'str'>
    2015-12-31      <class 'str'>
    2015-09-30      <class 'str'>
    2015-12-31      <class 'str'>
    2014-12-31      <class 'str'>
    2013-12-31      <class 'str'>
    2012-12-31      <class 'str'>
    Name: val, dtype: object
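To see why the cast to str matters, the sketch below compares the conversion with and without `astype(str)` on a small mixed column. The values and the names `mixed`, `without_cast` and `with_cast` are assumptions chosen for illustration. The point is that `.str` methods return NaN for elements that are not strings, so real float values would be lost without the cast.

```python
import pandas as pd

# Hypothetical mixed column: real floats followed by comma-formatted strings.
mixed = pd.Series([24.73, -0.06, '2,144.78', '1,809.40'], dtype=object)

# Without the cast, .str.replace yields NaN for the non-string elements,
# so the two float values are lost.
without_cast = pd.to_numeric(
    mixed.str.replace(',', '', regex=False), errors='coerce')

# Casting to str first turns the floats into text that parses back cleanly.
with_cast = pd.to_numeric(
    mixed.astype(str).str.replace(',', '', regex=False), errors='coerce')

print(without_cast)
print(with_cast)
```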
EDIT1:
If you need to apply the solution to all columns of the DataFrame, use apply:
    df1 = df.apply(lambda x: pd.to_numeric(x.astype(str).str.replace(',',''), errors='coerce'))
    print (df1)
                Revenue  Other, Net
    Date                           
    2016-09-30    24.73       -0.06
    2016-06-30    18.73       -1.88
    2016-03-31    17.56         NaN
    2015-12-31    29.14       -0.13
    2015-09-30    22.67         NaN
    2015-12-31    95.85       -0.14
    2014-12-31    84.58        0.07
    2013-12-31    58.33        0.00
    2012-12-31    29.63        0.00
    2016-09-30   243.91       -0.80
    2016-06-30   230.77       -1.12
    2016-03-31   216.58        1.32
    2015-12-31   206.23       -0.05
    2015-09-30   192.82       -0.34
    2015-12-31   741.15       -1.37
    2014-12-31   556.28       -1.90
    2013-12-31   414.51       -1.48
    2012-12-31   308.82        0.10
    2016-10-31  2144.78       41.98
    2016-07-31  2036.62       35.00
    2016-04-30  1916.60      -11.66
    2016-01-31  1809.40       27.09
    2015-10-31  1711.97       -3.44
    2016-01-31  6667.22       14.13
    2015-01-31  5373.59      -18.69
    2014-01-31  4071.00       -4.87
    2013-01-31  3050.20       -5.70

    print(df1.dtypes)
    Revenue       float64
    Other, Net    float64
    dtype: object
But if you only need to convert some columns of the DataFrame, select a subset of columns and use apply:
    cols = ['Revenue', ...]
    df[cols] = df[cols].apply(lambda x: pd.to_numeric(x.astype(str)
                                                       .str.replace(',',''), errors='coerce'))
    print (df)
                Revenue Other, Net
    Date                          
    2016-09-30    24.73      -0.06
    2016-06-30    18.73      -1.88
    2016-03-31    17.56           
    2015-12-31    29.14      -0.13
    2015-09-30    22.67           
    2015-12-31    95.85      -0.14
    2014-12-31    84.58       0.07
    2013-12-31    58.33          0
    2012-12-31    29.63          0
    2016-09-30   243.91       -0.8
    2016-06-30   230.77      -1.12
    2016-03-31   216.58       1.32
    2015-12-31   206.23      -0.05
    2015-09-30   192.82      -0.34
    2015-12-31   741.15      -1.37
    2014-12-31   556.28       -1.9
    2013-12-31   414.51      -1.48
    2012-12-31   308.82        0.1
    2016-10-31  2144.78      41.98
    2016-07-31  2036.62         35
    2016-04-30  1916.60     -11.66
    2016-01-31  1809.40      27.09
    2015-10-31  1711.97      -3.44
    2016-01-31  6667.22      14.13
    2015-01-31  5373.59     -18.69
    2014-01-31  4071.00      -4.87
    2013-01-31  3050.20       -5.7

    print(df.dtypes)
    Revenue       float64
    Other, Net     object
    dtype: object
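The column-subset variant can be sketched as a runnable example. The frame `df_demo`, the helper name `strip_commas_to_numeric` and the extra `Ticker` column are assumptions added for illustration; only the column names `Revenue` and `Other, Net` come from the output above.

```python
import pandas as pd

def strip_commas_to_numeric(col):
    """Cast to str, drop thousands separators, coerce to numbers (NaN on failure)."""
    cleaned = col.astype(str).str.replace(',', '', regex=False)
    return pd.to_numeric(cleaned, errors='coerce')

# Hypothetical frame: two numeric-looking text columns and one real text column.
df_demo = pd.DataFrame({
    'Revenue': ['24.73', '2,144.78', '6,667.22'],
    'Other, Net': ['-0.06', '', '14.13'],
    'Ticker': ['AAA', 'BBB', 'CCC'],  # made-up column that should stay untouched
})

# Convert only the listed columns; every other column is left as it was.
cols = ['Revenue', 'Other, Net']
df_demo[cols] = df_demo[cols].apply(strip_commas_to_numeric)

print(df_demo.dtypes)
```

Listing the columns explicitly keeps text columns from being turned into NaN by `errors='coerce'`.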
EDIT2:
A solution for your bonus problem:
    df = pd.DataFrame({'A':['q','e','r'],
                       'B':['4','5','q'],
                       'C':[7,8,9.0],
                       'D':['1,000','3','50,000'],
                       'E':['5','3','6'],
                       'F':['w','e','r']})
    print (df)
       A  B    C       D  E  F
    0  q  4  7.0   1,000  5  w
    1  e  5  8.0       3  3  e
    2  r  q  9.0  50,000  6  r

    #first apply original solution
    df1 = df.apply(lambda x: pd.to_numeric(x.astype(str).str.replace(',',''), errors='coerce'))
    print (df1)
        A    B    C      D  E   F
    0 NaN  4.0  7.0   1000  5 NaN
    1 NaN  5.0  8.0      3  3 NaN
    2 NaN  NaN  9.0  50000  6 NaN

    #mask where all values in a column are NaN - string columns
    mask = df1.isnull().all()
    print (mask)
    A     True
    B    False
    C    False
    D    False
    E    False
    F     True
    dtype: bool

    #restore the original strings in the all-NaN columns
    df1.loc[:, mask] = df1.loc[:, mask].combine_first(df)
    print (df1)
       A    B    C      D  E  F
    0  q  4.0  7.0   1000  5  w
    1  e  5.0  8.0      3  3  e
    2  r  NaN  9.0  50000  6  r
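As an alternative sketch of the same idea, the conversion can be decided column by column, which avoids writing strings back into columns that were already converted to float. This is not the method used above; the helper name `to_numeric_where_possible` is hypothetical, and the sample frame repeats the one from this edit.

```python
import pandas as pd

def to_numeric_where_possible(frame):
    """Convert each column to numbers; keep the original column when nothing parses."""
    out = {}
    for name, col in frame.items():
        converted = pd.to_numeric(
            col.astype(str).str.replace(',', '', regex=False), errors='coerce')
        # An all-NaN result means the column was purely text, so keep it unchanged.
        out[name] = col if converted.isna().all() else converted
    return pd.DataFrame(out, index=frame.index)

df = pd.DataFrame({'A': ['q', 'e', 'r'],
                   'B': ['4', '5', 'q'],
                   'C': [7, 8, 9.0],
                   'D': ['1,000', '3', '50,000'],
                   'E': ['5', '3', '6'],
                   'F': ['w', 'e', 'r']})

result = to_numeric_where_possible(df)
print(result)
print(result.dtypes)
```

Columns A and F stay as text, while B keeps a NaN where 'q' could not be parsed, matching the behavior of the mask-based approach.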