10 Lines of Code a Day, No. 162: Getting to Know the pandas Series Data Structure (Part 2)

Continuing from the previous post.
A Series works with NumPy functions and NumPy-style operations, for example:

In [19]: obj2>0
Out[19]:
d     True
b     True
a    False
c     True
dtype: bool

In [20]: obj2[obj2>0]
Out[20]:
d    4
b    7
c    3
dtype: int64

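Comparing a Series directly with a constant yields a boolean Series, and that boolean Series can then be used to filter the original.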
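
Boolean masks can also be combined before filtering. The following is a minimal sketch, assuming obj2 holds the values printed in the session above (it was created in the previous post, so the constructor call here is a reconstruction); the names mask, small_positive and non_positive are illustrative only.

```python
import pandas as pd

# Reconstructed from the printed output above; the original definition
# lives in the previous post.
obj2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

# Each comparison yields a boolean Series. Combine them with & (and),
# | (or) and ~ (not). The parentheses are required because & and |
# bind more tightly than the comparison operators.
mask = (obj2 > 0) & (obj2 < 5)
small_positive = obj2[mask]

# ~ inverts a mask: keep everything that is not strictly positive.
non_positive = obj2[~(obj2 > 0)]

print(small_positive)
print(non_positive)
```

Python's own `and` / `or` do not work element-wise on a Series, which is why the bitwise operators are used here.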

In [21]: obj2*2
Out[21]:
d     8
b    14
a   -10
c     6
dtype: int64

In [22]: import numpy as np

In [23]: np.exp(obj2)
Out[23]:
d      54.598150
b    1096.633158
a       0.006738
c      20.085537
dtype: float64

Arithmetic and element-wise math functions work as well, and the result keeps the original index labels.
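
As a rough sketch of what "keeps the index labels" implies, the example below also adds two Series whose labels are in a different order. obj2 is again reconstructed from the printed output, and the names doubled, absolute, reordered and total are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Reconstructed from the printed output above.
obj2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

doubled = obj2 * 2        # the scalar is applied to every element
absolute = np.abs(obj2)   # a NumPy ufunc returns a Series, not a bare array

# Arithmetic between two Series matches values by label, not by position.
reordered = obj2[["a", "b", "c", "d"]]
total = obj2 + reordered

print(type(absolute).__name__)
print(total)
```

Because the addition aligns on labels, each entry of total is twice the matching entry of obj2, regardless of the order in which the labels are stored.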

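Seen from another angle, a Series behaves much like a fixed-length, ordered dict that maps index labels to values. Some dict operations work on it too, and it can be used in many places where a dict is expected.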

In [24]: 'b' in obj2
Out[24]: True

In [25]: 'f' in obj2
Out[25]: False
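
One detail worth spelling out: like a dict, `in` tests the index labels, not the values. The sketch below illustrates this along with a couple of other dict-style calls; obj2 is reconstructed from the earlier output and the variable names are illustrative.

```python
import pandas as pd

# Reconstructed from the printed output above.
obj2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

label_present = "b" in obj2        # membership checks index labels
value_as_label = 7 in obj2         # 7 is a value, not a label
value_present = 7 in obj2.values   # search the underlying values instead

# dict-style lookup with a default for a missing label
fallback = obj2.get("f", 0)

# dict-style iteration over (label, value) pairs
pairs = list(obj2.items())

print(label_present, value_as_label, value_present, fallback)
print(pairs)
```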

A Series can also be created from a dict:

In [26]: sdata = {'Ohio':35000,'Texas':71000,'Oregon':160000,'Utah':5000}

In [27]: obj3 = pd.Series(sdata)

In [28]: obj3
Out[28]:
Ohio       35000
Texas      71000
Oregon    160000
Utah        5000
dtype: int64

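When a dict is passed to the Series constructor, the dict's keys become the Series index. With Python 3.7+ and a recent pandas, the Series keeps the dict's insertion order, as the output above shows; older pandas versions sorted the keys instead.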
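
If a specific order is needed, it is safer to ask for it explicitly than to rely on the dict. A minimal sketch, assuming the same sdata as above (the names insertion_order, by_label and by_value are illustrative):

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 160000, "Utah": 5000}
obj3 = pd.Series(sdata)

# On Python 3.7+ with a recent pandas this follows the dict's own order.
insertion_order = list(obj3.index)

# Explicit ordering, independent of how the dict was built.
by_label = obj3.sort_index()
by_value = obj3.sort_values(ascending=False)

print(insertion_order)
print(by_label)
print(by_value)
```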

In [29]: states = ['California','Ohio','Oregon','Texas']

In [30]: obj4 = pd.Series(sdata, index=states)

In [31]: obj4
Out[31]:
California         NaN
Ohio           35000.0
Oregon        160000.0
Texas          71000.0
dtype: float64
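
The construction above selects and reorders the dict's entries by label. As a hedged sketch, the same result can be obtained by building the Series first and then calling reindex; the name via_reindex is illustrative.

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 160000, "Utah": 5000}
states = ["California", "Ohio", "Oregon", "Texas"]

# Passing index= keeps only the requested labels, in the requested order.
obj4 = pd.Series(sdata, index=states)

# Building the Series first and reindexing gives an equal result.
via_reindex = pd.Series(sdata).reindex(states)

print(obj4.equals(via_reindex))  # equals() treats NaN in the same spot as equal
print("Utah" in obj4)            # not requested, so it is dropped
print(obj4.dtype)                # the missing entry forces a float dtype
```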

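sdata has no entry for 'California', so that label gets NaN, and 'Utah' is dropped because it is not in states. NaN ("not a number") is how pandas marks a missing value, roughly what databases call NULL; its presence is also why the dtype became float64. Missing values can be detected in the following ways: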

In [62]: pd.isnull(obj4)
Out[62]:
California     True
Ohio          False
Oregon        False
Texas         False
dtype: bool

In [63]: pd.notnull(obj4)
Out[63]:
California    False
Ohio           True
Oregon         True
Texas          True
dtype: bool

In [64]: obj4.isnull()
Out[64]:
California     True
Ohio          False
Oregon        False
Texas         False
dtype: bool

In [65]: obj4.notnull()
Out[65]:
California    False
Ohio           True
Oregon         True
Texas          True
dtype: bool
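
These boolean results are mostly useful as input to something else. The sketch below shows a few common follow-ups (counting, dropping and filling missing values), assuming obj4 as built above; the names n_missing, known, dropped and filled are illustrative.

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 160000, "Utah": 5000}
states = ["California", "Ohio", "Oregon", "Texas"]
obj4 = pd.Series(sdata, index=states)

# True counts as 1, so summing the mask counts the missing entries.
n_missing = int(obj4.isnull().sum())

# Keep only the entries that have a value; dropna() is the shorthand.
known = obj4[obj4.notnull()]
dropped = obj4.dropna()

# Replace missing entries with a default instead of dropping them.
filled = obj4.fillna(0)

print(n_missing)
print(known)
print(filled)
```

pd.isna and pd.notna are alternative names for pd.isnull and pd.notnull and behave the same way.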


Original article: https://outofmemory.cn/langs/870585.html
