Load average strangeness on Linux Ubuntu

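For the last couple of days I have been trying to understand some weirdness happening in our infrastructure, but I have not been able to figure it out, so I am turning to you for some hints.

In Graphite I have been noticing spikes in load_avg that occur with deadly regularity roughly every 2 hours. The interval is not exactly 2 hours, but it is very regular. I am attaching a screenshot I took from Graphite.

I got stuck investigating this. The regularity made me think it was some kind of cron job or similar, but there are NO cronjobs running on these servers; in fact these are VMs running in the Rackspace cloud. What I am looking for is some indication of what could be causing these spikes, and how to investigate them further.

The servers are fairly idle. This is a staging environment, so there is almost no traffic coming in and there should be no load on them. These are all VMs with 4 virtual cores. What I do know is that we take a bunch of Graphite samples about every 10 seconds, but if that were the cause of the load I would expect it to be constantly high rather than spiking every 2 hours in waves across different servers.

Any help on how to investigate this would be greatly appreciated!

Here is some sar data for app01, which is the first blue spike in the graph above. I was not able to draw any conclusions from it. Also note that the write spike you see every half hour (not every 2 hours) is due to chef-client, which runs every 30 minutes. I will try to gather more data, although what I have gathered so far has not led to any conclusions either.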

LOAD

09:55:01 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:05:01 PM         0       125      1.28      1.26      0.86         0
10:15:01 PM         0       125      0.71      1.08      0.98         0
10:25:01 PM         0       125      4.10      3.59      2.23         0
10:35:01 PM         0       125      0.43      0.94      1.46         3
10:45:01 PM         0       125      0.25      0.45      0.96         0
10:55:01 PM         0       125      0.15      0.27      0.63         0
11:05:01 PM         0       125      0.48      0.33      0.47         0
11:15:01 PM         0       125      0.07      0.28      0.40         0
11:25:01 PM         0       125      0.46      0.32      0.34         0
11:35:01 PM         2       130      0.38      0.47      0.42         0
11:45:01 PM         2       131      0.29      0.40      0.38         0
11:55:01 PM         2       131      0.47      0.53      0.46         0
11:59:01 PM         2       131      0.66      0.70      0.55         0
12:00:01 AM         2       131      0.81      0.74      0.57         0

CPU

09:55:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:05:01 PM     all      5.68      0.00      3.07      0.04      0.11     91.10
10:15:01 PM     all      5.01      0.00      1.70      0.01      0.07     93.21
10:25:01 PM     all      5.06      0.00      1.74      0.02      0.08     93.11
10:35:01 PM     all      5.74      0.00      2.95      0.06      0.13     91.12
10:45:01 PM     all      5.05      0.00      1.76      0.02      0.06     93.10
10:55:01 PM     all      5.02      0.00      1.73      0.02      0.09     93.13
11:05:01 PM     all      5.52      0.00      2.74      0.05      0.08     91.61
11:15:01 PM     all      4.98      0.00      1.76      0.01      0.08     93.17
11:25:01 PM     all      4.99      0.00      1.75      0.01      0.06     93.19
11:35:01 PM     all      5.45      0.00      2.70      0.04      0.05     91.76
11:45:01 PM     all      5.00      0.00      1.71      0.01      0.05     93.23
11:55:01 PM     all      5.02      0.00      1.72      0.01      0.06     93.19
11:59:01 PM     all      5.03      0.00      1.74      0.01      0.06     93.16
12:00:01 AM     all      4.91      0.00      1.68      0.01      0.08     93.33

I/O

09:55:01 PM       tps      rtps      wtps   bread/s   bwrtn/s
10:05:01 PM      8.88      0.15      8.72      1.21    422.38
10:15:01 PM      1.49      0.00      1.49      0.00     28.48
10:25:01 PM      1.54      0.00      1.54      0.03     29.61
10:35:01 PM      8.35      0.04      8.31      0.32    411.71
10:45:01 PM      1.58      0.00      1.58      0.00     30.04
10:55:01 PM      1.52      0.00      1.52      0.00     28.36
11:05:01 PM      8.32      0.01      8.31      0.08    410.30
11:15:01 PM      1.54      0.01      1.52      0.43     29.07
11:25:01 PM      1.47      0.00      1.47      0.00     28.39
11:35:01 PM      8.28      0.00      8.28      0.00    410.97
11:45:01 PM      1.49      0.00      1.49      0.00     28.35
11:55:01 PM      1.46      0.00      1.46      0.00     27.93
11:59:01 PM      1.35      0.00      1.35      0.00     26.83
12:00:01 AM      1.60      0.00      1.60      0.00     29.87

Network:

10:25:01 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
10:35:01 PM        lo      8.36      8.36      2.18      2.18      0.00      0.00      0.00
10:35:01 PM      eth1      7.07      4.77      5.24      2.42      0.00      0.00      0.00
10:35:01 PM      eth0      2.30      1.99      0.24      0.51      0.00      0.00      0.00
10:45:01 PM        lo      8.35      8.35      2.18      2.18      0.00      0.00      0.00
10:45:01 PM      eth1      3.69      3.45      0.65      2.22      0.00      0.00      0.00
10:45:01 PM      eth0      1.50      1.33      0.15      0.36      0.00      0.00      0.00
10:55:01 PM        lo      8.36      8.36      2.18      2.18      0.00      0.00      0.00
10:55:01 PM      eth1      3.66      3.40      0.64      2.19      0.00      0.00      0.00
10:55:01 PM      eth0      0.79      0.87      0.08      0.29      0.00      0.00      0.00
11:05:01 PM        lo      8.36      8.36      2.18      2.18      0.00      0.00      0.00
11:05:01 PM      eth1      7.29      4.73      5.25      2.41      0.00      0.00      0.00
11:05:01 PM      eth0      0.82      0.89      0.09      0.29      0.00      0.00      0.00
11:15:01 PM        lo      8.34      8.34      2.18      2.18      0.00      0.00      0.00
11:15:01 PM      eth1      3.67      3.30      0.64      2.19      0.00      0.00      0.00
11:15:01 PM      eth0      1.27      1.21      0.11      0.34      0.00      0.00      0.00
11:25:01 PM        lo      8.32      8.32      2.18      2.18      0.00      0.00      0.00
11:25:01 PM      eth1      3.43      3.35      0.63      2.20      0.00      0.00      0.00
11:25:01 PM      eth0      1.13      1.09      0.10      0.32      0.00      0.00      0.00
11:35:01 PM        lo      8.36      8.36      2.18      2.18      0.00      0.00      0.00
11:35:01 PM      eth1      7.16      4.68      5.25      2.40      0.00      0.00      0.00
11:35:01 PM      eth0      1.15      1.12      0.11      0.32      0.00      0.00      0.00
11:45:01 PM        lo      8.37      8.37      2.18      2.18      0.00      0.00      0.00
11:45:01 PM      eth1      3.71      3.51      0.65      2.20      0.00      0.00      0.00
11:45:01 PM      eth0      0.75      0.86      0.08      0.29      0.00      0.00      0.00
11:55:01 PM        lo      8.30      8.30      2.18      2.18      0.00      0.00      0.00
11:55:01 PM      eth1      3.65      3.37      0.64      2.20      0.00      0.00      0.00
11:55:01 PM      eth0      0.74      0.84      0.08      0.28      0.00      0.00      0.00

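For those curious about cronjobs, here is a summary of all cronjobs set up on the server (I picked app01, but this is also happening on a few other servers that have the same cronjobs set up):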

$ ls -ltr /etc/cron*
-rw-r--r-- 1 root root  722 Apr  2  2012 /etc/crontab

/etc/cron.monthly:
total 0

/etc/cron.hourly:
total 0

/etc/cron.weekly:
total 8
-rwxr-xr-x 1 root root 730 Dec 31  2011 apt-xapian-index
-rwxr-xr-x 1 root root 907 Mar 31  2012 man-db

/etc/cron.daily:
total 68
-rwxr-xr-x 1 root root  2417 Jul  1  2011 popularity-contest
-rwxr-xr-x 1 root root   606 Aug 17  2011 mlocate
-rwxr-xr-x 1 root root   372 Oct  4  2011 logrotate
-rwxr-xr-x 1 root root   469 Dec 16  2011 sysstat
-rwxr-xr-x 1 root root   314 Mar 30  2012 aptitude
-rwxr-xr-x 1 root root   502 Mar 31  2012 bsdmainutils
-rwxr-xr-x 1 root root  1365 Mar 31  2012 man-db
-rwxr-xr-x 1 root root  2947 Apr  2  2012 standard
-rwxr-xr-x 1 root root   249 Apr  9  2012 passwd
-rwxr-xr-x 1 root root   219 Apr 10  2012 apport
-rwxr-xr-x 1 root root   256 Apr 12  2012 dpkg
-rwxr-xr-x 1 root root   214 Apr 20  2012 update-notifier-common
-rwxr-xr-x 1 root root 15399 Apr 20  2012 apt
-rwxr-xr-x 1 root root  1154 Jun  5  2012 ntp

/etc/cron.d:
total 4
-rw-r--r-- 1 root root 395 Jan  6 18:27 sysstat

$ sudo ls -ltr /var/spool/cron/crontabs
total 0
$

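As you can see, there are no HOURLY cronjobs, only daily, weekly and so on.

I have collected a LOT of stats (vmstat, mpstat, iostat). However hard I try, I just cannot see any lead suggesting that any VM component is misbehaving... I am starting to lean towards a potential problem at the hypervisor. Feel free to have a look at the stats: they start with the sar -q output around the "offending" time, followed by the vmstat, mpstat and iostat output.

Basically it is still a total mystery to me...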

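Solution

Interesting.

First, you can increase the frequency of the sar logging. Instead of every 10 minutes, try logging every minute. The sysstat cronjob is configurable.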
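As a rough sketch of what "configurable" could look like here: the sar timestamps above (:05, :15, ... :55, plus 23:59) are consistent with the stock Debian/Ubuntu /etc/cron.d/sysstat entry that runs debian-sa1 on a `5-55/10 * * * *` schedule. Assuming that entry is present, the helper below (the name `sa1_every_minute` is made up for this example) rewrites only that schedule to every minute and leaves the 23:59 rotation line alone. It only filters text, so the result can be reviewed before anything is changed.

```shell
# sa1_every_minute: read a cron.d file on stdin and print it with the
# "every 10 minutes" debian-sa1 schedule replaced by "every minute".
# Assumption: the file uses the stock layout "<minute> * * * * root ... debian-sa1 ...".
# Lines whose schedule is not "<minute> * * * *" (such as the 23:59 run) are untouched.
sa1_every_minute() {
    sed -E '/debian-sa1/ s/^[^[:space:]]+([[:space:]]+\*){4}/* * * * */'
}

# Preview the change without modifying anything:
#   sa1_every_minute < /etc/cron.d/sysstat
```

If the preview looks right, the same edit can be made by hand in /etc/cron.d/sysstat as root. With one-minute samples a spike like the one at 10:25 PM can be placed within a minute instead of within a ten-minute window.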

Next, try to script the following commands.

ps auxf > /tmp/ps.out
vmstat 1 50 > /tmp/vm.out
mpstat -P ALL 1 50 > /tmp/mp.out
iostat -xdk 1 50 > /tmp/io.out
cat /proc/meminfo > /tmp/meminfo.out

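Collect this set of data at every iteration, or whenever the load average increases, either manually or via cron. It would be good to have data for at least one full business day.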
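One possible way to automate "collect when the load average increases" is sketched below. The function names, the threshold and the output directory are all assumptions made for this example, not part of the original advice. The script reads the 1-minute load average from /proc/loadavg and, only when it exceeds a threshold, writes the same five outputs into a timestamped directory so that successive spikes do not overwrite each other.

```shell
#!/bin/sh
# Sketch only: helper names, threshold and paths are hypothetical.

# load_exceeds LOAD THRESHOLD: succeed when LOAD is strictly greater than THRESHOLD.
# awk is used because POSIX sh cannot compare floating-point numbers.
load_exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !(l + 0 > t + 0) }'
}

# current_load: print the 1-minute load average (first field of /proc/loadavg).
current_load() {
    cut -d' ' -f1 /proc/loadavg
}

# collect_snapshot DIR: run the commands from the answer, writing into DIR.
# The three 50-second samplers run in parallel so they cover the same window.
collect_snapshot() {
    dir=$1
    mkdir -p "$dir" || return 1
    ps auxf > "$dir/ps.out"
    cat /proc/meminfo > "$dir/meminfo.out"
    vmstat 1 50 > "$dir/vm.out" &
    mpstat -P ALL 1 50 > "$dir/mp.out" &
    iostat -xdk 1 50 > "$dir/io.out" &
    wait
}

# collect_if_loaded THRESHOLD BASEDIR: take a snapshot only during a spike.
collect_if_loaded() {
    if load_exceeds "$(current_load)" "$1"; then
        collect_snapshot "$2/$(date +%Y%m%d-%H%M%S)"
    fi
}
```

Calling something like `collect_if_loaded 2.0 /tmp/loadspike` from a once-a-minute cron job would capture the 10:25 PM sample above (ldavg-1 of 4.10) while skipping the idle periods. Running the samplers in parallel is a deviation from the sequential list in the answer; it is done so that the vmstat, mpstat and iostat output describe the same 50 seconds.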

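Now, I understand that the servers are idle, but there must still be some applications running. What are they?

Is it possible to run some profiling tools, such as perf or oprofile?

Were any server hardware components changed? Even something as innocuous as a firmware upgrade or a software upgrade.

Hey, one question. What I/O scheduler are you running? I believe it is cfq; is there any chance you can change it to noop? Put elevator=noop in the kernel command line parameters, reboot the system and see whether that improves things.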
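Before rebooting with elevator=noop, it may help to confirm which scheduler is actually active. On Linux each block device exposes the available schedulers in /sys/block/DEV/queue/scheduler, with the active one shown in square brackets (for example `noop deadline [cfq]`). The sketch below, with helper names invented for this example, extracts the bracketed entry for every block device.

```shell
# active_scheduler: read a scheduler line on stdin and print the entry in brackets.
# If the line has no brackets it is printed unchanged.
active_scheduler() {
    sed 's/.*\[\([^]]*\)].*/\1/'
}

# list_schedulers: print "device: scheduler" for every block device that
# exposes a scheduler file under /sys/block.
list_schedulers() {
    for f in /sys/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#/sys/block/}
        dev=${dev%%/*}
        printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
    done
}
```

On kernels that offer noop, writing the name into the same file as root (`echo noop > /sys/block/DEV/queue/scheduler`, with DEV replaced by the real device name) switches the scheduler at runtime, which allows a quick test without a reboot. Whether noop is listed at all depends on the kernel version.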


Original URL: https://outofmemory.cn/yw/1043683.html
