[Python Learning] matplotlib Study Notes: Histograms


  • Drawing a histogram

An example:

  • Task: draw a histogram of the running times of 150 movies.

    • 1. The one-dimensional data for the 150 movies: y = [131,98,125,131,124,139,131,117,128,108,135,138,131,102,107,114,119,128,121,142,127,130,124,101,110,116,117,110,128,128,115,99,136,126,134,95,138,117,111,78,132,124,113,150,110,117,86,95,144,105,126,130,126,130,126,116,123,106,112,138,123,86,101,99,136,123,117,119,105,137,123,128,125,104,109,134,125,127,105,120,107,129,116,108,132,103,136,118,102,120,114,105,115,132,145,119,121,112,139,125,138,109,132,134,156,106,117,127,144,139,139,119,140,83,110,102,123,107,143,115,136,118,139,123,112,118,139,123,112,118,125,109,132,134,112,114,122,109,106,123,116,131,127,115,118,112,135,133,101,131]
    • 2. The y-axis shows counts
    • 3. Set the figure size and resolution
    • 4. Set the number of bins
    • 5. Set the grid
    • 6. Draw the grid lines
    • 7. Save the figure
  • Histogram:

    plt.hist(x, bins=None, ...)
    
    • x: one-dimensional data
    • bins: the number of bins
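  • How many bins should the data be split into?

    • Choose the number of bins with care: too few gives a large statistical error, and too many obscures the pattern in the data
    • Number of bins: with up to 100 data points, 5 to 12 bins are typical
    • Bin width: bin_width is chosen from the range of the data (maximum minus minimum); ideally the resulting number of bins is an integer, so pick a bin_width that divides the range evenly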

$$\text{number of bins} = \frac{\max(x) - \min(x)}{\text{bin\_width}}$$
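
The relationship between range, bin width, and number of bins can be checked in a few lines of Python. The sketch below is illustrative only: `bin_count` is a hypothetical helper (not part of matplotlib), and rounding up when the width does not divide the range is an assumption made here, not something the formula prescribes.

```python
def bin_count(data, bin_width):
    """Hypothetical helper: number of bins for a given bin width.

    Returns (count, divides_evenly).
    """
    span = max(data) - min(data)
    # Ceiling division, so a final partial bin still covers max(data)
    count = -(-span // bin_width)
    return count, span % bin_width == 0


# Only the extremes matter here; 78 and 156 are the min and max of the movie data
sample = [78, 131, 98, 156]

print(bin_count(sample, 6))   # (13, True): 78 / 6 = 13 exactly
print(bin_count(sample, 10))  # (8, False): 78 / 10 leaves a remainder
```

When the second value is False, passing only an integer count to plt.hist splits the range into equal bins whose actual width differs from bin_width, which is one reason to prefer a width that divides the range evenly.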

from matplotlib import pyplot as plt
import matplotlib

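# Select a font that can render Chinese text (SimSun must be installed; only needed when labels contain Chinese)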
matplotlib.rc('font',family='SimSun')

# Running times of the 150 movies (minutes)
y = [131,98,125,131,124,139,131,117,128,108,135,138,131,102,107,114,119,128,121,142,127,130,124,101,110,116,117,110,128,128,115,99,136,126,134,95,138,117,111,78,132,124,113,150,110,117,86,95,144,105,126,130,126,130,126,116,123,106,112,138,123,86,101,99,136,123,117,119,105,137,123,128,125,104,109,134,125,127,105,120,107,129,116,108,132,103,136,118,102,120,114,105,115,132,145,119,121,112,139,125,138,109,132,134,156,106,117,127,144,139,139,119,140,83,110,102,123,107,143,115,136,118,139,123,112,118,139,123,112,118,125,109,132,134,112,114,122,109,106,123,116,131,127,115,118,112,135,133,101,131]

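# Choose the number of bins
d = 6  # bin width
num_bins = (max(y) - min(y)) // d

# Set the figure size and resolution
plt.figure(figsize=(10,5),dpi=80)

# Set the x-axis ticks; stopping at max(y)+d ensures max(y) itself gets a tick
plt.xticks(range(min(y),max(y)+d,d))

# Draw the histogram: the first argument is the data, the second is the number of bins
plt.hist(y,num_bins)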

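# Draw the grid
plt.grid(alpha=0.3)

# Set the x-axis label
plt.xlabel("Movie running time (min)")

# Set the y-axis label
plt.ylabel("Count")

# Set the title
plt.title("Histogram of 150 movie running times")

# Save the histogram
plt.savefig("./1.svg")

# Show the histogram
plt.show()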
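
To see why a bin width that divides the range evenly matters, the sketch below compares an integer bin count with explicit bin edges. It uses numpy.histogram, which the matplotlib documentation names as the function plt.hist uses to bin the data, so no figure is needed. The ten-value `sample` is a hand-picked subset of the movie data that keeps its minimum (78) and maximum (156); `sample`, `edges`, and the other variable names are illustrative.

```python
import numpy as np

# Ten values taken from the movie data, including its min (78) and max (156)
sample = [78, 86, 95, 101, 110, 117, 123, 131, 139, 156]

d = 6
edges = list(range(min(sample), max(sample) + d, d))  # 78, 84, ..., 156

# Integer bin count, computed the same way as num_bins in the script
counts_a, edges_a = np.histogram(sample, bins=(max(sample) - min(sample)) // d)
# Explicit bin edges
counts_b, edges_b = np.histogram(sample, bins=edges)

print(np.allclose(edges_a, edges_b))       # edges agree when d divides the range
print(np.array_equal(counts_a, counts_b))  # so the counts agree as well

# With d = 10 the range (78) is not a multiple of the bin width
d = 10
_, uneven = np.histogram(sample, bins=(max(sample) - min(sample)) // d)
print(np.diff(uneven))  # 7 bins, each about 11.14 wide rather than 10

# Explicit edges keep the width at exactly 10; the last edge (158) passes max
wide_edges = list(range(min(sample), max(sample) + d, d))
counts_c, _ = np.histogram(sample, bins=wide_edges)
print(counts_c.sum())  # every value still falls inside some bin
```

plt.hist accepts the same kind of sequence through its bins argument, for example plt.hist(y, bins=edges), and reusing the same range(...) expression for plt.xticks keeps the ticks on the bar boundaries.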

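  • The x-axis of the plot is the movie running time and the y-axis is the count. How can the y-axis show frequency instead?
    • Pass density=True to plt.hist(). Note that this normalizes the bars so that their total area is 1: each bar's height is the relative frequency divided by the bin width (a probability density), not the relative frequency itself
  • The code is as follows: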
from matplotlib import pyplot as plt
import matplotlib

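# Select a font that can render Chinese text (SimSun must be installed; only needed when labels contain Chinese)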
matplotlib.rc('font',family='SimSun')

# Running times of the 150 movies (minutes)
y = [131,98,125,131,124,139,131,117,128,108,135,138,131,102,107,114,119,128,121,142,127,130,124,101,110,116,117,110,128,128,115,99,136,126,134,95,138,117,111,78,132,124,113,150,110,117,86,95,144,105,126,130,126,130,126,116,123,106,112,138,123,86,101,99,136,123,117,119,105,137,123,128,125,104,109,134,125,127,105,120,107,129,116,108,132,103,136,118,102,120,114,105,115,132,145,119,121,112,139,125,138,109,132,134,156,106,117,127,144,139,139,119,140,83,110,102,123,107,143,115,136,118,139,123,112,118,139,123,112,118,125,109,132,134,112,114,122,109,106,123,116,131,127,115,118,112,135,133,101,131]

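# Choose the number of bins
d = 6  # bin width
num_bins = (max(y) - min(y)) // d

# Set the figure size and resolution
plt.figure(figsize=(10,5),dpi=80)

# Set the x-axis ticks; stopping at max(y)+d ensures max(y) itself gets a tick
plt.xticks(range(min(y),max(y)+d,d))

# Draw the histogram: data, number of bins, and density=True to normalize the total bar area to 1
plt.hist(y,num_bins,density=True)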

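# Draw the grid
plt.grid(alpha=0.3)

# Set the x-axis label
plt.xlabel("Movie running time (min)")

# Set the y-axis label (density=True plots a probability density)
plt.ylabel("Probability density")

# Set the title
plt.title("Histogram of 150 movie running times")

# Save the histogram
plt.savefig("./1.svg")

# Show the histogram
plt.show()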
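
Since density=True yields a probability density rather than a relative frequency, it may help to compare the two numerically. The sketch below is one way to do so, assuming "relative frequency" means the count in a bin divided by the total number of values. It uses numpy.histogram (the function plt.hist uses for binning) and a ten-value subset of the movie data; all variable names are illustrative.

```python
import numpy as np

# Ten values taken from the movie data, including its min (78) and max (156)
sample = [78, 86, 95, 101, 110, 117, 123, 131, 139, 156]

d = 6  # bin width
edges = np.arange(min(sample), max(sample) + d, d)  # 78, 84, ..., 156

counts, _ = np.histogram(sample, bins=edges)
density, _ = np.histogram(sample, bins=edges, density=True)

# Weighting every value by 1/N makes the bar heights relative frequencies
weights = np.full(len(sample), 1 / len(sample))
freq, _ = np.histogram(sample, bins=edges, weights=weights)

print(np.isclose(density.sum() * d, 1.0))  # the bar areas sum to 1
print(np.isclose(freq.sum(), 1.0))         # the bar heights sum to 1
print(np.allclose(density * d, freq))      # density = frequency / bin width
```

plt.hist takes the same weights argument, so calling plt.hist(y, num_bins, weights=...) with one weight of 1/len(y) per value draws bars whose heights sum to 1, which matches a y-axis labelled as frequency.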


For more plot types, see matplotlib.org

Sharing is welcome; when reposting, please credit the source: 内存溢出

Original article: http://outofmemory.cn/langs/869281.html
