NumPy & pandas

Building a basic DataFrame and selectively extracting data with indexing and slicing

Code:

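import numpy as np
import pandas as pd

# Six consecutive days starting 2020-10-24, used as the row index
dates = pd.date_range('20201024', periods=6)
# 6x4 table holding 0..23, with columns a to d
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

print(df)
print(df['a'])  # df.a selects the same column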

Output:

             a   b   c   d
2020-10-24   0   1   2   3
2020-10-25   4   5   6   7
2020-10-26   8   9  10  11
2020-10-27  12  13  14  15
2020-10-28  16  17  18  19
2020-10-29  20  21  22  23


2020-10-24     0
2020-10-25     4
2020-10-26     8
2020-10-27    12
2020-10-28    16
2020-10-29    20
Freq: D, Name: a, dtype: int32
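
`df['a']` and `df.a` select the same column. The short sketch below checks that, and also shows one case where the attribute form falls short. The renamed column `count` is a hypothetical, chosen only because it collides with a DataFrame method; it is not part of the original example.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

# Bracket access and attribute access return the same column.
same = df['a'].equals(df.a)
print(same)

# Hypothetical column named like a DataFrame method: attribute access
# resolves to the method, so only brackets reach the column.
df2 = df.rename(columns={'d': 'count'})
is_method = callable(df2.count)
col = df2['count']
print(is_method, col.iloc[0])
```

For that reason the bracket form is the safer habit when column names are not known in advance.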

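Code:

print(df[0:3], df['20201025':'20201027'])  # slice rows by position, then by date label
print(df.loc['20201025'])  # a single row comes back as a Series, printed vertically

Output: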

            a  b   c   d
2020-10-24  0  1   2   3
2020-10-25  4  5   6   7
2020-10-26  8  9  10  11 



             a   b   c   d
2020-10-25   4   5   6   7
2020-10-26   8   9  10  11
2020-10-27  12  13  14  15

a    4
b    5
c    6
d    7
Name: 2020-10-25 00:00:00, dtype: int32
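
The two slices above behave differently at the end point: the positional slice `0:3` stops before row 3, while the date-label slice includes its end date. A minimal sketch that makes the difference explicit, rebuilding the same frame so it runs on its own:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

by_position = df[0:3]                   # rows 0, 1, 2; row 3 is excluded
by_label = df['20201025':'20201027']    # 25th through 27th; end date included

print(by_position.index[-1])
print(by_label.index[-1])

# The label slice gives the same rows as the explicit .loc form.
same = by_label.equals(df.loc['20201025':'20201027'])
print(same)
```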

Code:

print(df.loc[:, ['a', 'b']])  # select by label: all rows, columns a and b
print(df.loc['20201025', ['a', 'b']])  # one row, columns a and b

Output:

             a   b
2020-10-24   0   1
2020-10-25   4   5
2020-10-26   8   9
2020-10-27  12  13
2020-10-28  16  17
2020-10-29  20  21


a    4
b    5
Name: 2020-10-25 00:00:00, dtype: int32
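
What `.loc` returns depends on whether each selector is a single label or a list. The sketch below is an illustration under the same setup as above; using `pd.Timestamp` for the row label is my own choice here to keep the lookup unambiguous, not something the original code does.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

day = pd.Timestamp('2020-10-25')

row_as_series = df.loc[day, ['a', 'b']]    # single row label -> Series
row_as_frame = df.loc[[day], ['a', 'b']]   # list of row labels -> one-row DataFrame
single_value = df.loc[day, 'a']            # two single labels -> scalar

# Label slices on columns include the end label as well.
cols = df.loc[:, 'a':'c']

print(type(row_as_series).__name__, type(row_as_frame).__name__, single_value)
print(list(cols.columns))
```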

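Code:

print(df.iloc[3])  # select by position; the row comes back as a Series, printed vertically
print(df.iloc[3, 1])  # single value at row 3, column 1
print(df.iloc[3:5, 1:3])  # rows 3-4, columns 1-2 (end positions excluded)

Output: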

a    12
b    13
c    14
d    15
Name: 2020-10-27 00:00:00, dtype: int32


13


             b   c
2020-10-27  13  14
2020-10-28  17  18
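
To connect position-based and label-based selection, the sketch below picks the same block both ways. Note the end points: `iloc` excludes the stop position, while `loc` includes the stop label. The frame is rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

by_pos = df.iloc[3:5, 1:3]                             # positions 3-4 and 1-2
by_lab = df.loc['20201027':'20201028', 'b':'c']        # same block, by label

print(by_pos.equals(by_lab))

# Negative positions count from the end, as with Python lists.
last_a = df.iloc[-1, 0]
print(last_a)
```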

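Code:

print(df.iloc[[1, 3, 5], 1:3])  # pick non-contiguous rows by listing their positions
print(df[df.a > 4])  # boolean selection: every row where column a is greater than 4

Output: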

             b   c
2020-10-25   5   6
2020-10-27  13  14
2020-10-29  21  22

             a   b   c   d
2020-10-26   8   9  10  11
2020-10-27  12  13  14  15
2020-10-28  16  17  18  19
2020-10-29  20  21  22  23
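
The boolean selection works because `df.a > 4` is itself a Series of True/False values, one per row, and only the True rows are kept. A small sketch of that mask, plus combining conditions; the extra conditions on `b` and `d` are my own illustrations. Each condition needs its own parentheses, and `&` / `|` replace Python's `and` / `or`.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

mask = df.a > 4                 # the row with a == 4 is False: the test is strict
print(mask.tolist())

both = df[(df.a > 4) & (df.b < 20)]      # rows meeting both conditions
either = df[(df.a < 4) | (df.d > 20)]    # rows meeting at least one condition

print(both['a'].tolist())
print(either['a'].tolist())
```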

Sharing is welcome; when reposting, please credit the source: 内存溢出

Original article: http://outofmemory.cn/langs/869739.html
