Built-in Python, pandas, and NumPy data structures (Part 1)

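1. Built-in Python data structures: sequences (such as list), mappings (such as dict), and sets (set).

Only list, one of the sequence types, is covered below: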

Creating a list:

list1 = []

list1 = [1,2,3,4,5,6,7,8,9] # elements are separated by commas

list2 = [[1,2],[3,4],[5,6],[7,8]] # len(list2) is 4, and list2[0] == [1,2]

liststring = list("thisisalist") # splits the string into a list of single characters

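Indexing a list:

e = list1[0] # indices start at zero and go in square brackets

Slicing a list:

es = list1[0:3] # elements at indices 0, 1 and 2 (the end index is excluded)

es = list1[0:9:2] # the step goes after the second colon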

Concatenating lists (list1.append(obj), the + operator, and the * operator):

List length (len(list1)):
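The source names these list operations without showing code. A minimal sketch, where `list_a` and `list_b` are illustrative names that do not appear in the original:

```python
list_a = [1, 2, 3]
list_b = [4, 5]

list_a.append(6)            # append adds one element in place
joined = list_a + list_b    # + builds a new list from both operands
repeated = list_b * 2       # * repeats the list

print(list_a)        # [1, 2, 3, 6]
print(joined)        # [1, 2, 3, 6, 4, 5]
print(repeated)      # [4, 5, 4, 5]
print(len(joined))   # 6
```

Note that `*` on a plain list repeats it; it does not multiply the elements, which is why the next step uses numpy.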

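Multiplying every element of a list by a number:

import numpy

list2 = numpy.dot(list2,2) # the result is a numpy array, not a list

Multiplying lists like matrices (multiply matching elements and sum the results):

list3 = numpy.dot(list1,list1)

# the two lists must have the same length

list3 = numpy.dot(list2,list22)

# numpy.shape(list2) and numpy.shape(list22) must satisfy the matrix-multiplication rule: the number of columns on the left equals the number of rows on the right. The result, numpy.shape(list3), has the row count of the left operand and the column count of the right operand.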
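The shape rule is easier to see with concrete numbers. The sketch below uses made-up operands `left` and `right` (not from the original) and checks the three cases of `numpy.dot` described above:

```python
import numpy as np

left = [[1, 2], [3, 4], [5, 6], [7, 8]]   # shape (4, 2)
right = [[1, 0, 2], [0, 1, 3]]            # shape (2, 3)

# 2 columns on the left match 2 rows on the right, so the product is (4, 3)
product = np.dot(left, right)
print(np.shape(left), np.shape(right), product.shape)

scaled = np.dot(left, 2)                  # scalar case: every element doubled
inner = np.dot([1, 2, 3], [1, 2, 3])      # 1-D case: 1*1 + 2*2 + 3*3
print(inner)                              # 14
```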

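2. NumPy data structures:

Array:

Creating an array:

import numpy as np

data = np.array([[1, 9, 6], [2, 8, 5], [3, 7, 4]])

data = np.array(list1) # alternative: build the array from an existing list

data1 = np.zeros(5) # data1.shape == (5,): a one-dimensional array of 5 zeros

data1 = np.eye(5) # 5x5 identity matrix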

Indexing an array:

datacut = data[0,2] # row 0, column 2; with the 3x3 data above this is 6

Slicing an array:

datacut = data[0:2,2] # array([6, 5])

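Size of an array:

data.shape # tuple of dimensions, (3, 3) for the 3x3 data above

data.size # total number of elements, here 9

np.shape(data)

np.size(data)

len(data) # length of the first axis, here 3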

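Concatenating arrays:

# The arrays are passed as one sequence, so there is an inner pair of brackets (square brackets or parentheses) inside the call!

d = np.concatenate((data,data)) # axis 0 by default: rows are appended

d = np.concatenate((data,data),axis = 1) # corresponding rows are joined side by side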
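To make the effect of `axis` concrete, here is a small sketch that reuses the 3x3 `data` array from above and prints the resulting shapes; the names `stacked` and `side_by_side` are illustrative only:

```python
import numpy as np

data = np.array([[1, 9, 6], [2, 8, 5], [3, 7, 4]])

stacked = np.concatenate((data, data))               # default axis=0
side_by_side = np.concatenate((data, data), axis=1)

print(stacked.shape)          # (6, 3)
print(side_by_side.shape)     # (3, 6)
print(data.size, len(data))   # 9 3
```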

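Array addition: element by element (data + data)

Array multiplication:

d = data * data # element-wise product

d = np.dot(data,data) # matrix product

d = data * 3 # every element times 3

d = np.dot(data,3) # every element times 3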
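The difference between `*` and `np.dot` on arrays is a common source of mistakes, so a quick check may help. This sketch uses a small 2x2 array chosen for easy mental arithmetic (not the `data` from the original):

```python
import numpy as np

small = np.array([[1, 2], [3, 4]])

print(small + small)          # element-wise sum:     [[2, 4], [6, 8]]
print(small * small)          # element-wise product: [[1, 4], [9, 16]]
print(np.dot(small, small))   # matrix product:       [[7, 10], [15, 22]]
print(small * 3)              # scalar multiple:      [[3, 6], [9, 12]]
```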

Matrix operations on arrays:

Inverse: np.linalg.inv(data)

Transpose: data.T

Sum of all elements: np.sum(data)
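As a sanity check on these three operations, the sketch below uses a diagonal matrix `m` (an assumed example, chosen because its inverse is obvious) and verifies that the matrix times its inverse gives the identity:

```python
import numpy as np

m = np.array([[2.0, 0.0], [0.0, 4.0]])

inverse = np.linalg.inv(m)        # diagonal entries become 0.5 and 0.25
identity = np.dot(m, inverse)     # should be the 2x2 identity
transposed = np.array([[1, 2], [3, 4]]).T   # rows and columns swapped
total = np.sum(m)                 # 2.0 + 4.0

print(np.allclose(identity, np.eye(2)))   # True
print(total)                              # 6.0
```

`np.linalg.inv` raises `LinAlgError` for a singular matrix, so it only applies to square, invertible arrays.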

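Random number: np.random.normal(loc=0, scale=10, size=None) # one draw from a normal distribution with mean 0 and standard deviation 10

Array of standard normal random numbers: np.random.normal(size=(4,4))

Random sample from a two-dimensional normal distribution:

np.random.multivariate_normal([0,0],np.eye(2))

Random (M,N) matrix whose entries are the integers 0 or 1:

np.random.randint(0,2,(M,N)) # the upper bound 2 is excluded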
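Because `randint(0, 2, ...)` returns only the integers 0 and 1, it may not be what is wanted when floats between 0 and 1 are needed. The sketch below contrasts it with `np.random.random`, which is an addition not mentioned in the original; `M` and `N` are arbitrary example sizes:

```python
import numpy as np

M, N = 2, 3

ints = np.random.randint(0, 2, (M, N))   # integers drawn from {0, 1}
floats = np.random.random((M, N))        # floats in the interval [0.0, 1.0)
point = np.random.multivariate_normal([0, 0], np.eye(2))  # one 2-D sample

print(ints.shape, floats.shape, point.shape)   # (2, 3) (2, 3) (2,)
```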

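Matrix:

Creating a matrix:

# Note: np.mat is an alias of np.asmatrix; the alias was removed in NumPy 2.0, where np.asmatrix should be used instead.

mat1 = np.mat([[1, 2, 3], [4, 5, 6]])

mat1 = np.mat(list1) # from a Python list

mat1 = np.mat(data) # from an array

A matrix is always two-dimensional, and the operators +, - and * all act as matrix operations.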
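The practical consequence is that `*` means different things for an array and a matrix. A minimal sketch, using `np.asmatrix` (the function that `np.mat` aliases) so that it also runs on newer NumPy versions; `arr` and `mat` are illustrative names:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
mat = np.asmatrix(arr)   # same values, but with matrix semantics

print(arr * arr)   # element-wise product: [[1, 4], [9, 16]]
print(mat * mat)   # matrix product:       [[7, 10], [15, 22]]
print(mat.shape)   # (2, 2); a matrix stays two-dimensional
```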

Indexing and slicing a matrix:

mat1[0:2,1]

Transposing a matrix:

np.transpose(mat1)

mat1.transpose()

Concatenating matrices:

np.concatenate([mat1,mat1])

np.concatenate([mat1,mat1],axis = 1)

Summary of NumPy data structures: the operations are largely the same across them:

Creation: np.mat(list), np.array(list)

Matrix product: np.dot(x,y)

Transpose: x.T or np.transpose(x)

Concatenation: np.concatenate([x,y],axis = 1)

Indexing: mat[0:1,4], ary[0:1,4]

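3. pandas data structures:

Series:

Creating a Series:

import pandas as pd

s = pd.Series([[1,2,3],[4,5,6]],index = ['a','b'])

Indexing a Series:

s1 = s['b'] # the element labelled 'b', here the list [4,5,6]

Concatenating Series:

pd.concat([s,s],axis = 1) # joins side by side as columns; pd.concat([s,s]) joins end to end. Older pandas also offered s.append(s), which was removed in pandas 2.0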
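The two directions of `pd.concat` give different result types, which the following sketch shows. It uses a simpler Series of numbers than the original's Series of lists; the names `end_to_end` and `side_by_side` are illustrative:

```python
import pandas as pd

s = pd.Series([10, 20], index=["a", "b"])

end_to_end = pd.concat([s, s])              # a longer Series
side_by_side = pd.concat([s, s], axis=1)    # a DataFrame with two columns

print(list(end_to_end.index))   # ['a', 'b', 'a', 'b']
print(side_by_side.shape)       # (2, 2)
print(s["b"])                   # 20
```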

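DataFrame:

Creating a DataFrame:

df = pd.DataFrame([[1,2,3],[1,2,3]],index = ['a','b'],columns = ['x','y','z'])

Selecting columns of df:

dfc1 = df.x

dfc1 = df['x']

dfc2 = df.iloc[:,0] # iloc takes integer positions in the square brackets, not column names!

dfc2 = df.iloc[:,0:3] # columns at positions 0, 1 and 2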

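Selecting rows of df:

dfr1 = df.iloc[0]

df1 = df.iloc[0:2]

df1 = df[0:2] # plain brackets select rows only when given a slice (a range), not a single position

Selecting a single value from df:

dfc2 = df.iloc[0,0]

dfc2 = df.iloc[0:2,0:3] # with slices, this returns a sub-block of the DataFrame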
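Position-based `iloc` has a label-based counterpart, `loc`, which the original does not cover but which helps clarify why `iloc` rejects column names. A minimal sketch, with different values in the second row so the results are distinguishable:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=["a", "b"], columns=["x", "y", "z"])

by_position = df.iloc[1, 2]    # row position 1, column position 2
by_label = df.loc["b", "z"]    # the same cell, addressed by labels
first_row = df.iloc[0]         # a Series indexed by x, y, z
col_x = df["x"]                # a Series indexed by a, b

print(by_position, by_label)   # 6 6
print(first_row.tolist())      # [1, 2, 3]
print(col_x.tolist())          # [1, 4]
```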

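Reading a specific cell with pandas: row 2, column 3

import pandas as pd

df = pd.read_excel('测试.xlsx')

cell = df.iat[0, 2]

[0, 2] is the row and column position of the cell. By default pandas treats the first row of the sheet as the header and skips it, so position 0 is the second row of the sheet; the third column is position 2 (counting 0, 1, 2).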
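The row offset can be checked without an Excel file. In this sketch a hand-built DataFrame stands in for the result of `pd.read_excel`; the column names and values are invented for illustration:

```python
import pandas as pd

# Stand-in for a sheet whose first row ("name", "age", "city") has
# already been consumed as the header by the reader.
df = pd.DataFrame(
    [["Ann", 30, "Oslo"], ["Bob", 25, "Rome"]],
    columns=["name", "age", "city"],
)

cell = df.iat[0, 2]   # first data row (sheet row 2), third column
print(cell)           # Oslo
```

If the sheet has no header row, passing `header=None` to `pd.read_excel` keeps the first sheet row as data, and then position 0 is sheet row 1.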

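pandas provides a family of functions for reading different file types. The commonly used ones are:

read_csv(): reads CSV files.

read_excel(): reads Excel files.

read_hdf(): reads HDF5 files.

read_json(): reads JSON files.

read_pickle(): reads Python serialized files (pickle files).

read_sql(): reads data from a database.

Detailed usage for all of these functions can be found in the pandas documentation.

In addition, the pandas readers accept a file object opened with Python's built-in open() function, pd.read_table() reads general delimited table files, and pd.read_clipboard() reads data from the clipboard.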
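Since the readers accept file-like objects as well as paths, a reader can be tried out without creating a file. A minimal sketch using `io.StringIO` with made-up CSV text:

```python
import io

import pandas as pd

csv_text = "x,y\n1,2\n3,4\n"

# StringIO behaves like an open text file, so read_csv can parse it directly
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)           # (2, 2)
print(list(df.columns))   # ['x', 'y']
```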


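To filter rows by text, first access the column through its .str accessor (a pandas.core.strings.StringMethods object), then call contains to get a boolean Series. A row or column taken directly from the DataFrame is a Series, and contains cannot be called on the Series itself:

The filtering itself can then be done directly with [ ]: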
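The code for these two steps is missing from the source. A minimal sketch of what they might look like, with an invented DataFrame and the illustrative names `mask` and `filtered`:

```python
import pandas as pd

df = pd.DataFrame({"name": ["apple pie", "banana", "pineapple"], "qty": [1, 2, 3]})

# Step 1: .str.contains returns a boolean Series, one entry per row
mask = df["name"].str.contains("apple")

# Step 2: indexing the DataFrame with that Series keeps the True rows
filtered = df[mask]

print(mask.tolist())              # [True, False, True]
print(filtered["qty"].tolist())   # [1, 3]
```

By default `contains` treats the pattern as a regular expression; pass `regex=False` to match a literal substring.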

In the same way, the str accessor can be used for string operations on the rows or columns of a DataFrame:

s.str.lower()

s.str.upper()

s.str.len()

s.str.strip()

s.str.split(' ')

s.str.replace('@','$')

s.str.count()

s.str.startswith()

s.str.endswith()

s.str.find()

s.str.findall()

s.str.swapcase()

s.str.isupper()

s.str.islower()

s.str.isnumeric()
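Several of the methods listed above (count, startswith, endswith, find, findall) need a pattern argument in practice. The sketch below runs a few of them on an invented two-element Series to show typical calls and results:

```python
import pandas as pd

s = pd.Series([" Hello@World ", "abc"])

cleaned = s.str.strip().str.lower()             # accessor calls can be chained
swapped = s.str.replace("@", "$", regex=False)  # literal replacement
lengths = s.str.len()
l_counts = s.str.count("l")                     # occurrences of the pattern "l"
starts_a = s.str.startswith("a")

print(cleaned.tolist())    # ['hello@world', 'abc']
print(swapped.tolist())    # [' Hello$World ', 'abc']
print(lengths.tolist())    # [13, 3]
print(l_counts.tolist())   # [3, 0]
print(starts_a.tolist())   # [False, True]
```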


Sharing is welcome; when reposting, please credit the source: 内存溢出

Original URL: http://outofmemory.cn/web/9663619.html
