Building a DataFrame and selecting data by index and slice
Code:
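import numpy as np
import pandas as pd

# Six consecutive daily dates starting 2020-10-24, used as the row index
dates = pd.date_range('20201024', periods=6)
# 6x4 frame holding 0..23, with columns a to d
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])
print(df)
# Select column a; df.a returns the same Series
print(df['a'])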
Output:
             a   b   c   d
2020-10-24   0   1   2   3
2020-10-25   4   5   6   7
2020-10-26   8   9  10  11
2020-10-27  12  13  14  15
2020-10-28  16  17  18  19
2020-10-29  20  21  22  23
2020-10-24     0
2020-10-25     4
2020-10-26     8
2020-10-27    12
2020-10-28    16
2020-10-29    20
Freq: D, Name: a, dtype: int32
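For a simple column name, bracket access and attribute access return the same Series. The sketch below illustrates where they stop being interchangeable. The second frame `df2` and its column names are hypothetical, made up only for this illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

# Bracket and attribute access give equal Series for a plain name like 'a'.
same = df['a'].equals(df.a)

# Hypothetical frame: 'count' collides with the DataFrame.count method,
# and 'unit price' is not a valid Python identifier.
df2 = pd.DataFrame({'count': [1, 2], 'unit price': [3.0, 4.0]})
count_col = df2['count']        # bracket access returns the column
count_attr = df2.count          # attribute access returns the bound method instead
price_col = df2['unit price']   # only bracket access can reach this column

# Passing a list of names selects several columns and returns a DataFrame.
sub = df[['a', 'c']]
print(sub)
```

Bracket access is the safer default in scripts, since it keeps working whatever the column happens to be called.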
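Code:
# Slice rows by position: rows 0, 1 and 2
print(df[0:3])
# Slice rows by date label: both endpoint dates are included
print(df['20201025':'20201027'])
# A single row comes back as a Series, so it prints vertically
print(df.loc['20201025'])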
Output:
             a   b   c   d
2020-10-24   0   1   2   3
2020-10-25   4   5   6   7
2020-10-26   8   9  10  11
             a   b   c   d
2020-10-25   4   5   6   7
2020-10-26   8   9  10  11
2020-10-27  12  13  14  15
a    4
b    5
c    6
d    7
Name: 2020-10-25 00:00:00, dtype: int32
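The two slices above follow different endpoint rules: a positional slice leaves out the stop position, while a label slice includes both endpoint labels. A minimal sketch of that difference, using `.loc` and `.iloc` to make the intent explicit (the variable names are illustrative only):

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

# Positional slice: rows 0, 1, 2. The stop position 3 is excluded.
by_position = df[0:3]

# Label slice: 2020-10-25 through 2020-10-27, both endpoints included.
by_label = df['20201025':'20201027']

# The same selections written with explicit indexers.
by_position_iloc = df.iloc[0:3]
by_label_loc = df.loc['20201025':'20201027']

print(len(by_position), len(by_label))
```

Writing `.iloc` or `.loc` costs a few characters but removes any doubt about whether a slice is meant as positions or as labels.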
Code:
# Select by label: all rows, columns a and b
print(df.loc[:,['a','b']])
# Columns a and b of the row labelled 2020-10-25
print(df.loc['20201025',['a','b']])
Output:
             a   b
2020-10-24   0   1
2020-10-25   4   5
2020-10-26   8   9
2020-10-27  12  13
2020-10-28  16  17
2020-10-29  20  21
a    4
b    5
Name: 2020-10-25 00:00:00, dtype: int32
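Label selection with `.loc` also accepts slices on both axes and can return a single cell. The following is a small sketch along those lines; using `pd.Timestamp` for the row label and `.at` for scalar access are choices made for this illustration, not something the original example requires.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

# Label slices on rows and columns; both ends of each slice are included.
block = df.loc['20201025':'20201027', 'b':'c']

# A single cell: one row label and one column label.
cell = df.loc[pd.Timestamp('2020-10-25'), 'a']

# .at is a scalar-only accessor that takes the same pair of labels.
cell_at = df.at[dates[1], 'a']

print(block)
print(cell, cell_at)
```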
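Code:
# Select by position: row 3 comes back as a Series, so it prints vertically
print(df.iloc[3])
# Single value at row 3, column 1
print(df.iloc[3,1])
# Rows 3 to 4 and columns 1 to 2 (the stop position is excluded)
print(df.iloc[3:5,1:3])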
Output:
a    12
b    13
c    14
d    15
Name: 2020-10-27 00:00:00, dtype: int32
13
             b   c
2020-10-27  13  14
2020-10-28  17  18
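Sometimes the row is known by position but the column by name. One way to handle that, sketched here as an illustration rather than the only approach, is to convert one side so that a single indexer can be used:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

# Turn the row position into a label, then use .loc.
via_loc = df.loc[df.index[3], 'b']

# Or turn the column label into a position, then use .iloc.
via_iloc = df.iloc[3, df.columns.get_loc('b')]

# Both reach the same cell as df.iloc[3, 1].
print(via_loc, via_iloc)
```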
Code:
# Pick non-adjacent rows 1, 3 and 5 by position
print(df.iloc[[1,3,5],1:3])
# Boolean selection: every row whose value in column a is greater than 4
print(df[df.a>4])
Output:
             b   c
2020-10-25   5   6
2020-10-27  13  14
2020-10-29  21  22
             a   b   c   d
2020-10-26   8   9  10  11
2020-10-27  12  13  14  15
2020-10-28  16  17  18  19
2020-10-29  20  21  22  23
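Boolean selection extends naturally to several conditions. The sketch below combines masks with `&`, filters rows and columns together through `.loc`, and tests membership with `isin`. The thresholds and value lists are arbitrary, chosen only to fit this 6x4 example.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20201024', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6, 4)), index=dates, columns=['a', 'b', 'c', 'd'])

# Combine conditions with & (and) or | (or); each condition needs its own parentheses.
mask = (df['a'] > 4) & (df['c'] < 20)
filtered = df[mask]

# .loc takes a boolean mask for the rows and labels for the columns.
picked = df.loc[df['a'] > 4, ['a', 'd']]

# isin keeps the rows whose value appears in the given list.
in_set = df[df['b'].isin([1, 9, 21])]

print(filtered)
print(picked)
print(in_set)
```

Python's `and` and `or` do not work on whole Series, which is why the element-wise operators `&` and `|` are used here.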