How to Monitor MySQL

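First, a word about pt-stalk. It is one of the tools in Percona Toolkit, a package most people already know: everyday tools such as pt-query-digest and pt-online-schema-change come from it, so I will not introduce the toolkit further.

The main job of pt-stalk is to collect OS and MySQL diagnostic information when a problem occurs, including:

1. OS level: CPU, I/O, memory, disk, and network information;

2. MySQL level: row lock waits, session connections, replication, status variables, and so on.

pt-stalk is also a shell script, which is friendly to people like me who cannot read Perl. The monitoring logic and commands in the script are worth borrowing when you build your own monitoring system.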

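III. Usage

Next, let's look at how to use the tool.

pt-stalk usually runs as a background service that watches MySQL and waits for a trigger condition; when the condition is met, it collects the relevant diagnostic data.

The parameters that control the trigger are:

function:

∘ Defaults to status, which watches the output of SHOW GLOBAL STATUS;

∘ Can also be set to processlist, which watches the output of SHOW PROCESSLIST;

variable:

∘ Defaults to Threads_running. This is the item to watch, chosen from the output selected by function;

threshold:

∘ Defaults to 25. The trigger condition is met when the watched value exceeds the threshold;

∘ When the watched value is not numeric, such as the State column of processlist, use it together with the match parameter;

cycles:

∘ Defaults to 5, meaning collection is triggered only after the condition has been observed five times in a row;

Connection parameters: host, user, password, port, socket.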
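The interplay of threshold and cycles can be sketched in a few lines of POSIX shell. This is only an illustration of the rule described above, not pt-stalk's own code; the function name should_trigger and the sample values are made up for the example.

```shell
# should_trigger THRESHOLD CYCLES VALUE...
# Succeeds (exit 0) once CYCLES consecutive values exceed THRESHOLD.
# Illustrative helper only; it is not part of pt-stalk.
should_trigger() {
  threshold=$1
  cycles=$2
  shift 2
  streak=0
  for value in "$@"; do
    if [ "$value" -gt "$threshold" ]; then
      streak=$((streak + 1))
    else
      streak=0   # one sample at or below the threshold resets the count
    fi
    if [ "$streak" -ge "$cycles" ]; then
      return 0
    fi
  done
  return 1
}

# Five samples in a row above 25: collection would be triggered.
should_trigger 25 5 30 31 40 28 26 && echo "collect"
# The dip to 20 resets the streak: no trigger.
should_trigger 25 5 30 31 20 28 26 || echo "keep watching"
```

Requiring several consecutive observations keeps a single short spike in Threads_running from starting a collection.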

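Other important parameters:

iterations: how many times pt-stalk triggers a collection before exiting. By default it runs forever.

run-time: how long to collect data once triggered. Defaults to 30 seconds.

sleep: how long to sleep after a collection before monitoring resumes. Defaults to 300 seconds.

interval: how often the watched value is checked to decide whether to trigger a collection. Defaults to 1 second.

dest: directory where collected data is stored. Defaults to /var/lib/pt-stalk.

retention-time: how long collected data is kept. Defaults to 30 days.

daemonize: run as a background service. Off by default.

log: log file used when running in the background. Defaults to /var/log/pt-stalk.log.

collect: collect diagnostic data when the trigger fires. On by default.

∘ collect-gdb: collect GDB stack traces; requires gdb.

∘ collect-strace: collect strace data; requires strace.

∘ collect-tcpdump: collect tcpdump data; requires tcpdump.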
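Putting the parameters together, a background invocation might look like the sketch below. The wrapper function pt_stalk_cmd is hypothetical and only prints the command line so it can be reviewed first; the host, port, and the monitor account are placeholders, and the values shown are simply the defaults listed above. The password is left out on purpose and would have to be supplied with the password parameter.

```shell
# Hypothetical wrapper: prints the pt-stalk command line it would run.
# Host, port, and user are placeholders; the option names are the
# parameters described above, set to their documented defaults.
pt_stalk_cmd() {
  echo pt-stalk \
    --function status --variable Threads_running \
    --threshold 25 --cycles 5 \
    --run-time 30 --sleep 300 --interval 1 \
    --dest /var/lib/pt-stalk --retention-time 30 \
    --log /var/log/pt-stalk.log --daemonize \
    --host 127.0.0.1 --port 3306 --user monitor
}

pt_stalk_cmd               # review the command first
# eval "$(pt_stalk_cmd)"   # then run it (requires Percona Toolkit)
```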

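I don't quite understand what you mean by "database objects"...

Database monitoring usually just means running a simple SQL statement on a schedule.

Something like:

select (0) from test;

That said, many services that use a data source, and most monitoring systems, already implement this.

WebLogic, Nagios and the like all have this feature; you just need to configure it.

Monitoring the database port also works: telnet to it on a schedule.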
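Both suggestions (a trivial query and a port probe) fit the same small pattern, sketched here in shell. The helper check_alive is made up for this example; the commented probes assume the mysql client and nc are installed, and the host, port, and monitor account are placeholders.

```shell
# check_alive LABEL COMMAND...
# Runs COMMAND as a probe and reports whether it succeeded.
check_alive() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "$label OK"
  else
    echo "$label FAILED"
    return 1
  fi
}

# Probes one might schedule from cron (assumptions: mysql client and nc
# are installed; host, port, and the monitor account are placeholders):
#   check_alive mysql-query mysql -h 127.0.0.1 -u monitor -e 'SELECT 1'
#   check_alive mysql-port  nc -z -w 3 127.0.0.1 3306
check_alive self-test true
```

A query probe tells you the server can actually execute SQL, while a port probe only shows that something is listening, so the two are worth combining.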

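Installing and Configuring Monit

I. Introduction

Monit is a utility for Unix-like platforms that monitors processes, files, directories, and devices. It can repair programs that have stopped or are misbehaving, which makes it a good fit for software failures with many possible causes.

II. Installation

The installation and configuration below are assumed to be done as root.

Installation is simple. Download the Monit source code (the latest version at the time of writing was 4.10.1), monit-4.10.1.tar.gz, put it in a suitable directory, unpack it, then run configure (the defaults are fine), make, and make install. In a terminal:

tar -xzf monit-4.10.1.tar.gz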

cd monit-4.10.1

./configure

make

make install

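Installation finishes quickly.

III. Configuration

After installation, copy Monit's configuration file, monitrc, from the source directory to /etc:

cp monitrc /etc

Note that the permissions on /etc/monitrc must be no more permissive than 0700, so you may also need to change them:

chmod 600 /etc/monitrc

Then open /etc/monitrc and configure it. Monit ships most settings as examples in this file; for most of them you only need to remove the leading # (comment marker) and adjust the values. We mainly use Monit to watch a Tomcat server, so the configuration is: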

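# Run Monit as a daemon and check every 2 minutes (120 seconds).
# 2 minutes is the default interval, and the example configurations
# found online also use 2 minutes, so it should be a reasonable value.
set daemon 120

# Log file location. To write to the system log instead, use:
# set logfile syslog
set logfile /var/log/monit.log

# Monit has a built-in HTTP server for viewing the status of monitored
# services. Open this port in the firewall 【1】; otherwise it cannot be
# reached from anywhere but localhost.
set httpd port 3000 and
    # Address of the HTTP server. If set to localhost,
    # only local access is possible.
    use address 192.168.1.184
    allow localhost                    # allow local access
    allow 192.168.1.1/255.255.255.0    # allow access from the internal network
    allow admin:monit11                # require user admin, password monit11

# Mail server. Once set, Monit sends its alerts by email. localhost is
# used here, which assumes sendmail is installed and running locally.
set mailserver localhost

# Recipient address. To send to several addresses, repeat this setting.
set alert 88fly@163.com

# Monitor Tomcat. The pid file is explained in 【2】.
check process tomcat with pidfile /var/run/catalina.pid
    start program = "/etc/init.d/tomcat start"    # start command
    stop program = "/etc/init.d/tomcat stop"      # stop command
    # After 9 restarts within 10 monitoring cycles, time out and stop
    # monitoring this service. The reason is explained in 【3】.
    if 9 restarts within 10 cycles then timeout
    # Alert if the service's CPU usage stays above 90% for 5 cycles.
    if cpu usage >90% for 5 cycles then alert
    # Restart the service if opening the URL fails for 5 consecutive
    # cycles (120-second timeout; a timeout also counts as a failure).
    if failed url timeout 120 seconds for 5 cycles then restart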

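【1】For example, for the port 3000 set in the configuration (Monit's default port is 2812):

/sbin/iptables -A INPUT -i eth0 -p tcp --dport 3000 -j ACCEPT

/sbin/service iptables save

【2】The pid file /var/run/catalina.pid is used to check the tomcat service (the service name is arbitrary). Tomcat does not use a pid file by default; it has to be set explicitly. Open catalina.sh in the bin directory under the Tomcat directory and add the following near the top (but not on the first line):

CATALINA_PID=/var/run/catalina.pid

This sets the pid file. Restart Tomcat, and the pid file can then be referenced in the Monit configuration.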
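To confirm that the pid file really points at a live process after the restart, a check along these lines can help. It is an illustration of the general pidfile technique, not Monit's actual implementation; the function name pid_alive is made up for this sketch.

```shell
# pid_alive PIDFILE
# Succeeds if PIDFILE is readable and names a process that is running.
pid_alive() {
  pidfile=$1
  [ -r "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  [ -n "$pid" ] || return 1
  # kill -0 sends no signal; it only checks that the process exists
  # (and that we are allowed to signal it).
  kill -0 "$pid" 2>/dev/null
}

if pid_alive /var/run/catalina.pid; then
  echo "tomcat is running"
else
  echo "tomcat is not running (or the pid file is missing)"
fi
```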

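【3】Timing out and no longer monitoring keeps the service from restarting forever: if several restarts in a row fail, further restarts will most likely fail too. Restarting Tomcat also consumes a lot of system resources, so restarting endlessly would keep other services from working properly.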

To monitor other services, add more monitoring rules. For example, to monitor the MySQL service:

check process mysql with pidfile /var/run/mysqld/mysqld.pid

start program = "/etc/init.d/mysqld start"

stop program = "/etc/init.d/mysqld stop"

if failed host 127.0.0.1 port 3306 then restart

if 5 restarts within 5 cycles then timeout

Monitoring the SSH service:

check process sshd with pidfile /var/run/sshd.pid

start program "/etc/init.d/sshd start"

stop program "/etc/init.d/sshd stop"

if failed port 22 protocol SSH then restart

if 5 restarts within 5 cycles then timeout

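If you monitor many services, you can put each service's rules in a separate file and pull them in with the include statement, which keeps the configuration file cleaner. For example: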

include /etc/monit/includes/mysqld

Once the settings above are in place, make Monit start with the system by adding the following to the end of /etc/inittab:

# Run monit in standard run-levels

mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc

Then run

telinit q

to start Monit.

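IV. Things to Watch Out For

Because Monit runs as a daemon and is registered in inittab to start with the system, init restarts the monit process whenever it stops. Monit in turn watches the other services, so a service monitored by Monit cannot be stopped in the usual way: as soon as it stops, Monit starts it again. To stop a monitored service, use a command of the form monit stop name. For example, to stop Tomcat:

monit stop tomcat

To stop all services monitored by Monit, use monit stop all.

To start a service, use a command of the form monit start name; to start all of them, use monit start all.

The above is reposted from:

I looked into Monit today. The author above covered it in such detail that I simply borrowed the text, and I will add SMS alerting.

Our company has an SMS interface, so alerts are sent directly, as follows: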

Monitoring some performance metrics of the local machine:

check system 127.0.0.1

if loadavg (5min) >4 for 4 times within 5 cycles then exec "/etc/monit/script/sendsms sysload 5min >4"

if memory usage >90% then exec "/etc/monit/script/sendsms 127.0.0.1 memory usage>90%"

if cpu usage (user) >70% for 4 times within 5 cycles then exec "/etc/monit/script/sendsms cpu(user) >70%"

if cpu usage (system) >30% for 4 times within 5 cycles then exec "/etc/monit/script/sendsms cpu(system) >30% "

if cpu usage (wait) >20% for 4 times within 5 cycles then exec "/etc/monit/script/sendsms system busy! cpu(wait) >20%"
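The "4 times within 5 cycles" wording can be read as a sliding window over the most recent check results. The sketch below shows that reading in POSIX shell; the function within_cycles is made up for the illustration, and treating the rule as "at least TIMES failures among the last CYCLES results" is an assumption of this sketch rather than a statement about Monit's internals.

```shell
# within_cycles TIMES CYCLES RESULT...
# Each RESULT is 1 (the check failed in that cycle) or 0 (it passed).
# Succeeds if some window of CYCLES consecutive results contains at
# least TIMES failures.
within_cycles() {
  times=$1
  cycles=$2
  shift 2
  window=""                      # most recent results, one char each
  for result in "$@"; do
    window="$window$result"
    if [ "${#window}" -gt "$cycles" ]; then
      window=${window#?}         # drop the oldest result
    fi
    ones=0                       # count the failures in the window
    rest=$window
    while [ -n "$rest" ]; do
      case $rest in
        1*) ones=$((ones + 1)) ;;
      esac
      rest=${rest#?}
    done
    if [ "$ones" -ge "$times" ]; then
      return 0
    fi
  done
  return 1
}

# 4 failures among the last 5 cycles: the action fires.
within_cycles 4 5 1 1 0 1 1 && echo "exec sendsms"
# Never more than 3 failures in any 5-cycle window: nothing happens.
within_cycles 4 5 1 0 1 0 1 0 1 || echo "no action"
```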

Monitoring some ports on a remote machine:

check host Unicom_mobi with address 211.90.246.51

if failed icmp type echo count 10 with timeout 20 seconds then exec "/etc/monit/script/sendsms Unicom_mobi 211.90.246.51 ping failed!"

if failed port 22 type tcp with timeout 10 seconds for 2 times within 3 cycles then exec "/etc/monit/script/sendsms unicom 211.90.246.51:22 connect failed!"

if failed port 9528 type tcp with timeout 10 seconds for 2 times within 3 cycles then exec "/etc/monit/script/sendsms unicom 211.90.246.51:9528 connect failed!"

if failed port 9529 type tcp with timeout 10 seconds for 2 times within 3 cycles then exec "/etc/monit/script/sendsms unicom 211.90.246.51:9529 connect failed!"

if failed port 9530 type tcp with timeout 10 seconds for 2 times within 3 cycles then exec "/etc/monit/script/sendsms unicom 211.90.246.51:9530 connect failed!"

A nice thing about Monit is that it can restart services and run custom scripts when a monitored check fails, for example:

check file passwd path /etc/passwd

# if failed md5 checksum

# then exec "/usr/bin/killall -q monit"

check filesystem root with path /dev/mapper/VolGroup00-LogVol00

if space usage >80% for 5 times within 15 cycles then exec "/etc/monit/script/clear_core.sh"

else if succeed for 1 times within 2 cycles then exec "/etc/monit/script/sendsms '/dev/sda1 usage >90% clear core file succeed!'>/dev/null 2>&1"

Feel free to share. When reposting, please credit the source: 内存溢出

Original URL: http://outofmemory.cn/zaji/6136940.html
