MHA(Master High Availability)是一套相对成熟的MysqL高可用方案,能做到在0~30s内自动完成数据库的故障切换 *** 作,在master服务器不宕机的情况下,基本能保证数据的一致性。
它由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。其中,MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave上。MHA Node则运行在每个MysqL节点上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它自动将最新数据的slave提升为master,然后将其它所有的slave指向新的master。
在MHA自动故障切换过程中,MHA试图保存master的二进制日志,从而最大程度地保证数据不丢失,当这并不总是可行的,譬如,主服务器硬件故障或无法通过ssh访问,MHA就没法保存二进制日志,这样就只进行了故障转移但丢失了最新数据。可结合MysqL 5.5中推出的半同步复制来降低数据丢失的风险。
MHA软件由两部分组成:Manager工具包和Node工具包,具体说明如下:
MHA Manager:
1. masterha_check_ssh:检查MHA的SSH配置状况
2. masterha_check_repl:检查MysqL的复制状况
3. masterha_manager:启动MHA
4. masterha_check_status:检测当前MHA运行状态
5. masterha_master_monitor:检测master是否宕机
6. masterha_master_switch:控制故障转移(自动或手动)
7. masterha_conf_host:添加或删除配置的server信息
8. masterha_stop:关闭MHA
MHA Node:
save_binary_logs:保存或复制master的二进制日志
apply_diff_relay_logs:识别差异的relay log并将差异的event应用到其它slave中
filter_MysqLbinlog:去除不必要的RolLBACK事件(MHA已不再使用这个工具)
purge_relay_logs:消除中继日志(不会堵塞sql线程)
另有如下几个脚本需自定义:
1. master_ip_failover:管理VIP
2. master_ip_online_change:
3. masterha_secondary_check:当MHA manager检测到master不可用时,通过masterha_secondary_check脚本来进一步确认,减低误切的风险。
4. send_report:当发生故障切换时,可通过send_report脚本发送告警信息。
集群信息
角色 IP地址 ServerID 类型
Master 192.168.244.10 1 写入
Candicate master 192.168.244.20 2 读
Slave 192.168.244.30 3 读
Monitor host 192.168.244.40 监控集群组
注: *** 作系统均为RHEL 6.7
其中,master对外提供写服务,备选master提供读服务,slave也提供相关的读服务,一旦master宕机,将会把备选master提升为新的master,slave指向新的master
一、在所有节点上安装MHA node
1. 在MysqL服务器上安装MHA node所需的perl模块(DBD:MysqL)
# yum install perl-dbd-mysql -y
2. 在所有的节点上安装mha node
下载地址为:https://code.Google.com/p/MysqL-master-ha/wiki/Downloads?tm=2
由于该网址在国内被墙,相关文件下载后,放到了个人网盘中,http://pan.baIDu.com/s/1boS31vT,有需要的童鞋可自行下载。
# tar xvf mha4MysqL-node-0.56.tar.gz
# cd mha4MysqL-node-0.56
# perl Makefile.PL
Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Can.pm line 6.BEGIN Failed--compilation aborted at inc/Module/Install/Can.pm line 6.Compilation Failed in require at inc/Module/Install.pm line 283.Cant locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Makefile.pm line 4.BEGIN Failed--compilation aborted at inc/Module/Install/Makefile.pm line 4t locate ExtUtils/MM_Unix.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Metadata.pm line 349.VIEw Code
通过报错可以看出,是相关依赖包没有安装。
# yum install perl-ExtUtils-MakeMaker -y
# perl Makefile.PL
*** Module::autoInstall version 1.03*** Checking for Perl dependencIEs...Cant locate CPAN.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/autoInstall.pm line 277.
# yum install perl-CPAN -y
# perl Makefile.PL
*** Module::autoInstall version Perl dependencIEs...[Core Features]- DBI ...loaded. (1.609)- DBD::MysqL ...loaded. (4.013)*** Module::autoInstall configuration finished.Checking if your kit is complete...Looks goodWriting Makefile for mha4MysqL::nodeVIEw Code
# make
# make install
至此,MHA node节点安装完毕,会在/usr/local/bin下生成以下脚本文件
# ll /usr/local/bin/total 44-r-xr-xr-x 1 root root 16367 Jul 20 07:00 apply_diff_relay_logs-r-xr-xr-x 1 root root 4807 Jul filter_MysqLbinlog-r-xr-xr-x 8261 Jul purge_relay_logs-r-xr-xr-x 7525 Jul 00 save_binary_logs
二、在Monitor host节点上部署MHA Manager
# tar xvf mha4MysqL-manager-0.56.tar.gz
# cd mha4MysqL-manager-0.56
# perl Makefile.PL
*** Module::autoInstall version Perl dependencIEs...[Core Features]- DBI ...loaded. ()- DBD::MysqL ...loaded. ()- Time::HiRes ...missing.- Config::Tiny ...missing.- Log::dispatch ...missing.- Parallel::ForkManager ...missing.- MHA::NodeConst ...missing.==> auto-install the 5 mandatory module(s) from CPAN? [y] y*** DependencIEs will be installed the next time you type make'.*** your kit is complete...Looks goodWarning: prerequisite Config::Tiny 0 not found.Warning: prerequisite Log::dispatch not found.Warning: prerequisite MHA::NodeConst not found.Warning: prerequisite Parallel::ForkManager not found.Warning: prerequisite Time::HiRes not found.Writing Makefile for mha4MysqL::managerVIEw Code
# make
# make install
执行完毕后,会在/usr/local/bin下新增以下几个文件
# ll /usr/local/bin/40-r-xr-xr-x 1991 Jul 00:50 masterha_check_repl-r-xr-xr-x 1775 Jul masterha_check_ssh-r-xr-xr-x 1861 Jul masterha_check_status-r-xr-xr-x 3197 Jul masterha_conf_host-r-xr-xr-x 2513 Jul masterha_manager-r-xr-xr-x 2161 Jul masterha_master_monitor-r-xr-xr-x 2369 Jul masterha_master_switch-r-xr-xr-x 5167 Jul masterha_secondary_check-r-xr-xr-x 1735 Jul 50 masterha_stop
三、配置SSH登录无密码验证
1. 在manager上配置到所有Node节点的无密码验证
# ssh-keygen
一路按“Enter”
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
2. 在Master(192.168.244.10)上配置
# ssh-keygen
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
3. 在Candicate master(192.168.244.20)上配置
# ssh-keygen
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
4. 在Slave(192.168.244.30)上配置
# ssh-keygen
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
# ssh-copy-ID -i /root/.ssh/ID_rsa.pub [email protected]
四、搭建主从复制环境
1. 在Master上执行备份
# MysqLdump --master-data=2 --single-transaction -R --triggers -A > all.sql
其中,-R是备份存储过程,--triggers是备份触发器 -A代表全库
2. 在Master上创建复制用户
MysqL> grant replication slave on *.* to 'repl'@192.168.244.%' IDentifIEd by ;query OK,0 rows affected (0.09 sec)
3. 查看备份文件all.sql中的CHANGE MASTER语句
# head -n 30 all.sql
-- CHANGE MASTER TO MASTER_LOG_file='MysqL-bin.000002',MASTER_LOG_POS=120;
4. 将备份文件复制到Candicate master和Slave上
# scp all.sql 192.168.244.20:/root/
# scp all.sql 192.168.244.30:/root/
5. 在Candicate master上搭建从库
# MysqL < all.sql
设置复制信息
MysqL> CHANGE MASTER TO -> MASTER_HOST=192.168.244.10,1)">-> MASTER_USER-> MASTER_PASSWORD-> MASTER_LOG_fileMysqL-bin.000002-> MASTER_LOG_POS=1200 rows affected,2 warnings (0.19 sec)MysqL> start slave;query OK,1); Font-weight: bold">0.02> show slave status\G
6. 在Slave上搭建从库
7. slave服务器设置为read only
MysqLset global read_only10.04 sec)
8. 在Master中创建监控用户
MysqLgrant all privileges monitor%monitor1230.07 sec)
五、 配置MHA
1. 在Monitor host(192.168.244.40)上创建MHA工作目录,并且创建相关配置文件
# mkdir -p /etc/masterha
# vim /etc/masterha/app1.cnf
[server default]manager_log=/masterha/app1/manager.log //设置manager的日志manager_workdir=/masterha/app1 设置manager的工作目录master_binlog_dir=/var/lib/MysqL 设置master默认保存binlog的位置,以便MHA可以找到master的日志master_ip_failover_script= /usr/local/bin/master_ip_failover 设置自动failover时候的切换脚本master_ip_online_change_script= /usr/local/bin/master_ip_online_change 设置手动切换时候的切换脚本user=monitor 设置监控用户password=monitor123 设置监控用户的密码Ping_interval=1 设置监控主库,发送Ping包的时间间隔,默认是3秒,尝试三次没有回应的时候进行自动failoverremote_workdir=/tmp 设置远端MysqL在发生切换时binlog的保存位置repl_user=repl 设置复制环境中的复制用户名repl_password=repl 设置复制用户的密码report_script=/usr/local/bin/send_report 设置发生切换后发送的报警的脚本secondary_check_script= /usr/local/bin/masterha_secondary_check -s 192.168.244.20 -s 244.30 --user=root --master_host=244.10 --master_ip=244.10 --master_port=3306 一旦MHA到master的监控之间出现问题,MHA Manager将会判断其它两个slave是否能建立到master_ip 3306端口的连接shutdown_script="" 设置故障发生后关闭故障主机脚本(该脚本的主要作用是关闭主机防止发生脑裂)ssh_user=root 设置ssh的登录用户名[server1]hostname=244.10port=3306[server2]244.20candIDate_master=1 设置为候选master,如果设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库不是集群中最新的slavecheck_repl_delay=0 默认情况下如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间,通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candIDate_master=1的主机非常有用,因为它保证了这个候选主在切换过程中一定是最新的master[server3]244.303306
注意:
1> 在编辑该文件时,后面的注释切记要去掉,MHA并不会将后面的内容识别为注释。
2> 配置文件中设置了master_ip_failover_script,secondary_check_script,master_ip_online_change_script,report_script,对应的文件见文章末 尾。
2. 设置relay log清除方式(在每个Slave上)
MysqL> set global relay_log_purge=0 rows affected (0.00 sec)
MHA在发生切换过程中,从库在恢复的过程中,依赖于relay log的相关信息,所以我们这里要将relay log的自动清楚设置为OFF,采用手动清楚relay log的方式。
在默认情况下,从服务器上的中继日志会在sql线程执行完后被自动删除。但是在MHA环境中,这些中继日志在恢复其它从服务器时可能会被用到,因此需要禁用中继日志的自动清除。改为定期手动清除sql线程应用完的中继日志。
在ext3文件系统下,删除大的文件需要一定的时间,这样会导致严重的复制延迟,所以在linux中,一般都是通过硬链接的方式来删除大文件。
3. 设置定期清理relay脚本
MHA节点中包含了purge_relay_logs脚本,它可以为relay log创建硬链接,执行set global relay_log_purge=1,等待几秒钟以便sql线程切换到新的中继日志,再执行set global relay_log_purge=0。
下面看看脚本的使用方法:
# purge_relay_logs --user=monitor --password=monitor123 -disable_relay_log_purge --workdir=/tmp/
2017-04-24 20:27:46: purge_relay_logs script started. Found relay_log.info: /var/lib/MysqL/relay-log.info opening /var/lib/MysqL/MysqLd-relay-bin.000001 .. opening /var/lib/MysqL/MysqLd-relay-bin.000002000003000004000005000006 .. Executing SET GLOBAL relay_log_purge=1; FLUSH LOGS; sleePing a few seconds so that sql thread can delete older relay log files ( it keeps up); SET GLOBAL relay_log_purge=0; .. ok.50: All relay log purging operations succeeded.
其中,
--user:MysqL用户名
--password:MysqL用户的密码
--host: MysqLserver地址
--workdir:指定创建relay log的硬链接的位置,默认的是/var/tmp。由于系统不同分区创建硬链接文件会失败,故需要指定具体的硬链接的位置。
--disable_relay_log_purge:默认情况下,如果relay_log_purge=1,则脚本会直接退出。通过设置这个参数,该脚本会首先将relay_log_purge设置为1,清除掉relay log后,再将该参数设置为0。
设置crontab来定期清理relay log
MHA在切换的过程中会直接调用MysqLbinlog命令,故需要在环境变量中指定MysqLbinlog的具体路径。
# vim /etc/cron.d/purge_relay_logs
0 4 * * * /usr/local/bin/purge_relay_logs --user=monitor --password=monitor123 -disable_relay_log_purge --workdir=/tmp/ >> /tmp/purge_relay_logs.log 2>&1
注意:最好是每台slave服务器在不同时间点执行该计划任务。
4. 将MysqLbinlog的路径添加到环境变量中
六、 检查SSH的配置
在Monitor host上执行
# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Wed Jul 14:33:36 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. SkipPing.Wed Jul 2016 - [info] Reading application default configuration from /etc/masterha/app1.cnf..Wed Jul info] Reading server configuration from /etc/masterha/] Starting SSH connection tests..Wed Jul 51 2016 - [deBUG] Wed Jul 2016 - [deBUG] Connecting via SSH from root@244.10(244.10:22) to root@244.20(244.20:22)..Wed Jul 48 [deBUG] ok.Wed Jul 244.30(244.30:50 55 37 49 54 info] All SSH connection tests passed successfully.VIEw Code
七、查看整个集群的状态
在Monitor host上执行
# masterha_check_repl --conf=/etc/masterha/app1.cnf
Wed Jul 44:30 info] MHA::MasterMonitor version 0.56.Wed Jul 31 info] GTID failover mode = Wed Jul ] Dead Servers:Wed Jul ] Alive Servers:Wed Jul info] )Wed Jul ] Alive Slaves:Wed Jul 3306) Version=5.6.31 (oldest major version between slaves) log-bin:DisabledWed Jul info] Replicating from info] Primary candIDate the new Master (candIDate_master is set)Wed Jul info] Current Alive Master: ] Checking slave configurations..Wed Jul 2016 - [warning] log-bin is not set on slave ). This host cannot be a master.Wed Jul ] Checking replication filtering settings..Wed Jul info] binlog_do_db=,binlog_ignore_db= Wed Jul ] Replication filtering check ok.Wed Jul 2016 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm,ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnfWed Jul 482016 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm,ln523] Error happened on monitoring servers.Wed Jul info] Got exit code 1 (Not master dead).MysqL Replication Health is NOT OK!VIEw Code
报错很明显,Candicate master和Slave都没有启动log-bin,如果没有启动的话,后续就无法提升为主
设置log-bin后,重新执行:
Wed Jul 15:49:58 59 31-log (oldest major version between slaves) log-bin:enabledWed Jul info] GTID (with auto-pos) is not supportedWed Jul 50:17 ] All SSH connection tests passed successfully.Wed Jul ] Checking MHA Node version..Wed Jul 18 ] Version check ok.Wed Jul ] Checking SSH publickey authentication settings on the current master..Wed Jul info] HealthCheck: SSH to is reachable.Wed Jul 21 info] Master MHA Node version is info] Checking recovery script configurations on info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/MysqL --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=MysqLd-bin.info] Connecting to root@).. Creating /tmp not exists.. ok. Checking output directory is accessible or not.. ok. binlog found at /var/lib/MysqL,up to MysqLd-bin.23 info] binlog setting check done] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..Wed Jul info] Executing command : apply_diff_relay_logs --command=test --slave_user=monitor' --slave_host=244.20 --slave_ip=244.20 --slave_port=3306 --workdir=/tmp --target_version=31-log --manager_version=0.56 --relay_log_info=/var/lib/MysqL/relay-log.info --relay_dir=/var/lib/MysqL/ --slave_pass=xxxWed Jul ).. Checking slave recovery environment settings.. opening /var/lib/MysqL/relay-log. ... ok. Relay log found at /var/lib/MysqL,up to MysqLd-relay-bin. Temporary relay log file is /var/lib/MysqL/MysqLd-relay-bin. Testing MysqL connection and privileges..Warning: Using a password on the command line interface can be insecure. . Testing MysqLbinlog output.. . Cleaning up test file(s).. 28 244.30 --slave_ip=244.30 --slave_port=00000832 info] Slaves settings check ] ) (current master) +--) +--)Wed Jul info] Checking replication health on ..Wed Jul ] ok.Wed Jul ] Checking master_ip_failover_script status:Wed Jul info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=244.10 --orig_master_ip=244.10 --orig_master_port=] OK.Wed Jul [warning] shutdown_script is not defined.Wed Jul (Not master dead).MysqL Replication Health is OK.VIEw Code
检查通过~
八、 检查MHA Manager的状态
# masterha_check_status --conf=/etc/masterha/app1.cnf app1 is stopped(2:NOT_RUNNING).
如果正常,会显示“Ping_OK”,否则会显示“NOT_RUNNING”,代表MHA监控还没有开启。
九、开启MHA Manager监控
# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 2>&1 &
其中,
remove_dead_master_conf:该参数代表当发生主从切换后,老的主库的IP将会从配置文件中移除。
ignore_last_failover:在默认情况下,MHA发生切换后将会在/masterha/app1下产生app1.failover.complete文件,下次再次切换的时候如果发现该目录下存在该文件且两次切换的时间间隔不足8小时的话,将不允许触发切换。除非在第一次切换后手动rm -rf /masterha/app1/app1.failover.complete。该参数代表忽略上次MHA触发切换产生的文件。
查看MHA Manager监控是否正常
# masterha_check_status --conf=/etc/masterha/app1.cnf app1 (pID:1873) is running(0:Ping_OK),master:244.10
十、 关闭MHA Manager监控
# masterha_stop --conf=/etc/masterha/app1.cnf Stopped app1 successfully.[1]+ Exit 1 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /masterha/app1/manager.log 1
至此,MHA部分配置完毕,下面,来配置VIP。
十一、VIP配置
VIP配置可以采用两种方式,一是通过引入Keepalived来管理VIP,另一种是在脚本中手动管理。
对于keepalived管理VIP,存在脑裂情况,即当主从网络出现问题时,slave会抢占VIP,这样会导致主从数据库都持有VIP,造成IP冲突,所以在网络不是很好的情况下,不建议采用keepalived服务。
在实际生产中使用较多的也是第二种,即在脚本中手动管理VIP,所以,对keepalived不感兴趣的童鞋可直接跳过第一种方式。
1. keepalived管理VIP
1> 安装keepalived
因为我这里设置了Candicate master,故只在Master和Candicate master上安装。
如果没有Candicate master,两个Slave的地位平等,则两个Slave上都需安装keepalived。
# wget http://www.keepalived.org/software/keepalived-1.2.24.tar.gz
# tar xvf keepalived-1.2.24.tar.gz
# cd keepalived-1.2.24
# ./configure --prefix=/usr/local/keepalived
# make
# make install
# cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/rc.d/init.d/
# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
# mkdir /etc/keepalived
# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
2> 为keepalived设置单独的日志文件(非必需)
keepalived的日志默认是输出到/var/log/message中
# vim /etc/sysconfig/keepalived
KEEPAliVED_OPTIONS="-D -d -S 0"
设置syslog
# vim /etc/rsyslog.conf
添加如下内容:
local0.* /var/log/keepalived.log
# service rsyslog restart
2> 配置keepalived
在Master上修改
# vim /etc/keepalived/keepalived.conf
global_defs { notification_email { [email protected] } notification_email_from [email protected] smtp_server 127.0.0.1 smtp_connect_timeout 30 router_ID MysqL-HA}vrrp_instance VI_1 { state BACKUP interface eth0 virtual_router_ID 51 priority 150 advert_int nopreempt authentication { auth_type PASS auth_pass 1111 } virtual_ipaddress { 244.188/24 }}VIEw Code
关于keepalived的参数的详细介绍,可参考:
LVS+Keepalived搭建MyCAT高可用负载均衡集群
keepalived工作原理和配置说明
将配置文件scp到Candicate master上
# scp /etc/keepalived/keepalived.conf 192.168.244.20:/etc/keepalived/
只需将配置文件中的priority设置为90
注意:我们为什么在这里设置keepalived为backup模式呢?
在master-backup模式下,如果主库宕掉,VIP会自动漂移到Slave上,当主库修复,keepalived启动后,还会将VIP抢过来,即使设置了nopreempt(不抢占)的方
式,该动作仍会发生。但在backup-backup模式下,当主库修改,并启动keepalived后,并不会抢占新主的VIP,即便原主的priority高于新主的。
3> 启动keepalived
先在Master上启动
# service keepalived start
env: /etc/init.d/keepalived: Permission denIEd
# chmod +x /etc/init.d/keepalived
# service keepalived start
查看绑定情况
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNowN link/loopback 00 brd inet 0.1/8 scope host lo inet6 ::1/128 scope host valID_lft forever preferred_lft forever2: eth0: <broADCAST,MulTICAST,1)">1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:c6:47:04 brd ff:ff:ff:ff:ff:ff inet 244.10/24 brd 244.255 scope global eth0 inet scope global secondary eth0 inet6 fe80::20c:29ff:fec6:4704/64 scope link valID_lft forever preferred_lft foreverVIEw Code
可见,VIP(192168.244.188)已经绑定到Master的eth0网卡上了。
启动Candicate master的keepalived
# service keepalived start
4> MHA中引入keepalived
编辑/usr/local/bin/master_ip_failover
相对于原文件,修改地方为93-95行
1 #!/usr/bin/env perl 2 3 # copyright (C) 2011 DeNA Co.,Ltd. 4 # 5 # This program is free software; you can redistribute it and/or modify 6 # it under the terms of the GNU General Public license as published by 7 # the Free Software Foundation; either version 2 of the license,or 8 # (at your option) any later version. 9 10 # This program is distributed the hope that it will be useful, 11 # but WITHOUT ANY WARRANTY; without even the implIEd warranty of 12 # MERCHANTABIliTY or fitness FOR A PARTIculaR PURPOSE. See the 13 # GNU General Public license for more details. 14 15 # You should have received a copy of the GNU General Public license 16 # along with this program; if not,write to the Free Software 17 # Foundation,Inc.,1)"> 18 # 51 Franklin Street,Fifth Floor,Boston,MA 02110-1301 USA 19 20 ## Note: This is a sample script and is not complete. Modify the script based on your environment. 21 22 use strict; 23 use warnings FATAL => all; 24 25 use Getopt::Long; 26 use MHA::DBHelper; 27 my ( 28 $command,$ssh_user,$orig_master_host,1)"> 29 $orig_master_ip,$orig_master_port,$new_master_host,1)"> 30 $new_master_ip,$new_master_port,$new_master_user,1)"> 31 $new_master_password 32 ); 33 34 Getoptions( 35 command=s' => \$command,1)"> 36 ssh_user=s' => \$ssh_user,1)"> 37 orig_master_host=s' => \$orig_master_host,1)"> 38 orig_master_ip=s' => \$orig_master_ip,1)"> 39 orig_master_port=i \$orig_master_port,1)"> 40 new_master_host=s' => \$new_master_host,1)"> 41 new_master_ip=s' => \$new_master_ip,1)"> 42 new_master_port=i \$new_master_port,1)"> 43 new_master_user=s \$new_master_user,1)"> 44 new_master_password=s' => \$new_master_password,1)"> 45 46 47 exit &main(); 48 49 sub main { 50 if ( $command eq stop" || $command eq stopssh" ) { 51 52 # $orig_master_host,$orig_master_ip,$orig_master_port are passed. 53 # If you manage master ip address at global catalog database,1)"> 54 # invalIDate orig_master_ip here. 55 my $exit_code = 56 eval { 57 58 # updating global catalog,etc 59 $exit_code = 60 }; 61 ($@) { 62 warn Got Error: $@\n 63 exit $exit_code; 64 } 65 exit $exit_code; 66 } 67 elsif ( $command eq start 68 69 # all arguments are passed. 70 71 # activate new_master_ip here. 72 # You can also grant write access (create user,set read_only= 73 my $exit_code = 10 74 75 my $new_master_handler = new MHA::DBHelper(); 76 77 # args: hostname 78 $new_master_handler->connect( $new_master_ip,1)"> 79 $new_master_user,$new_master_password,1)"> ); 80 81 ## Set read_only= on the new master 82 $new_master_handler->disable_log_bin_local(); 83 print Set read_only=0 on the new master.\n 84 $new_master_handler->disable_read_only(); 85 86 ## Creating an app user on the new master 87 #print Creating app user on the new master..\n 88 #FIXME_xxx_create_user( $new_master_handler->{dbh} ); 89 $new_master_handler->enable_log_bin_local(); 90 $new_master_handler->disconnect(); 91 92 ## Update master ip on the catalog database,1)"> 93 my $cmd; 94 $cmd = ssh '.$ssh_user.@'.$orig_master_ip. service keepalived stop 95 system($cmd); 96 $exit_code = 97 98 99 warn $@;100 101 # If you want to continue failover,exit .102 103 104 105 106 elsif ( $command eq status107 108 # do nothing109 exit 110 111 else {112 &usage();113 exit 114 115 }116 117 sub usage {118 print119 Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n120 }VIEw Code
2. 通过脚本的方式管理VIP
编辑/usr/local/bin/master_ip_failover
#!/usr/bin/perl# copyright (C) or modify# it under the terms of the GNU General Public license as published by# the Free Software Foundation; either version details.## You should have received a copy of the GNU General Public license# along with this program; to the Free Software# Foundation,# USA## Note: This is a sample script and is not complete. Modify the script based on your environment.use strict;use warnings FATAL => ;use Getopt::Long;use MHA::DBHelper;my ( $command,$new_master_ip,$new_master_password);my $vip = 192.168.244.188;my $key = 2;my $ssh_start_vip = /sbin/ifconfig eth0:$key $vip/24;my $ssh_stop_vip = /sbin/ifconfig eth0:$key down;my $ssh_send_garp = /sbin/arPing -U $vip -I eth0 -c 1;Getoptions( main();sub main { ) { # $orig_master_host,$orig_master_port are passed. # If you manage master ip address at global catalog database,# invalIDate orig_master_ip here. my $exit_code = ; eval { print disabling the VIP an old master: $orig_master_host \n; &stop_vip(); $exit_code = ; }; ($@) { warn ; exit $exit_code; } exit $exit_code; } elsif ( $command eq ) { # all arguments are passed. # If you manage master ip address at global catalog database,# activate new_master_ip here. # You can also grant ; eval { my $new_master_handler = new MHA::DBHelper(); # args: ); ## Set read_only= on the new master $new_master_handler->disable_log_bin_local(); print ; $new_master_handler->disable_read_only(); ## Creating an app user on the new master # print ; # FIXME_xxx_create_user( $new_master_handler->{dbh} ); $new_master_handler->enable_log_bin_local(); $new_master_handler->disconnect(); print Enabling the VIP $vip on the new master: $new_master_host \nstart_vip(); $exit_code = ($@) { warn $@; # If you want to continue failover,exit . exit $exit_code; } exit $exit_code; } elsif ( $command eq ) { # nothing exit ; } { &usage(); exit ; }}sub start_vip(){ `ssh $ssh_user\@$new_master_host \ $ssh_start_vip \"`; ` $ssh_send_garp \"`;}sub stop_vip(){
return 0 unless ($ssh_user); `ssh $ssh_user\@$orig_master_host \ $ssh_stop_vip \"`;}sub usage { print;}
实际生产环境中,推荐这种方式来管理VIP,可有效防止脑裂情况的发生。
至此,MHA高可用环境基本搭建完毕。
关于MHA的常见 *** 作,包括自动Failover,手动Failover,在线切换,可参考另一篇博客:
MHA在线切换的步骤和原理
MHA自动Failover与手动Failover的实践及原理
总结:
1. 可单独调试master_ip_failover,master_ip_online_change,send_report等脚本
/usr/local/bin/master_ip_online_change --command=stop --orig_master_ip=244.10 --orig_master_host=244.10 --orig_master_port=3306 --orig_master_user=monitor --orig_master_password=monitor123 --orig_master_ssh_user=root --new_master_host=244.20 --new_master_ip=244.20 --new_master_port=3306 --new_master_user=monitor --new_master_password=monitor123 --new_master_ssh_user=root
/usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=3306 --new_master_host=3306 --new_master_user=' --new_master_password=monitor123'
2. 官方对于master_ip_failover,master_ip_online_change,send_report脚本,给出的只是sample,切换的逻辑需要自己定义。
很多童鞋对perl并不熟悉,觉得无从下手,其实,完全可以调用其它脚本,譬如python,shell等。
如:
[root@node4 ~]# cat test.pl#!/usr/bin/use strict;my $cmd=python /root/test.py;system($cmd);[root@node4 ~]# test.py#!/usr/bin/pythonprint hello,python[root@node4 ~]# test.plhello,python
参考:
《深入浅出MysqL》
附:
master_ip_online_change
#!/usr/bin/env perl# copyright (C) 2011 DeNA Co.,Ltd.## This program is free software; you can redistribute it and/or modify# it under the terms of the GNU General Public license as published by# the Free Software Foundation; either version 2 of the license,or# (at your option) any later version.## This program is distributed in the hope that it will be useful,# but WITHOUT ANY WARRANTY; without even the implIEd warranty of# MERCHANTABIliTY or fitness FOR A PARTIculaR PURPOSE. See the# GNU General Public license for more details.## You should have received a copy of the GNU General Public license# along with this program; if not,write to the Free Software# Foundation,# 51 Franklin Street,MA 02110-1301 USA## Note: This is a sample script and is not complete. Modify the script based on your environment.use strict;use warnings FATAL => 'all;use Getopt::Long;use MHA::DBHelper;NodeUtil;use Time::HiRes qw( sleep gettimeofday tv_interval );use Data::Dumper;my $_tstart$_running_interval = $vip = 192.168.244.188$key = "2$ssh_start_vip = /sbin/ifconfig eth0:$key $vip/24$ssh_stop_vip = /sbin/ifconfig eth0:$key down$ssh_send_garp = /sbin/arPing -U $vip -I eth0 -c 1my ( $command,1)">$orig_master_is_new_slave,1)">$orig_master_host,1)">$orig_master_ip,1)">$orig_master_port,1)">$orig_master_user,1)">$orig_master_password,1)">$orig_master_ssh_user,1)">$new_master_host,1)">$new_master_ip,1)">$new_master_port,1)">$new_master_user,1)">$new_master_password,1)">$new_master_ssh_user,1)">);Getoptions( command=s' => \orig_master_is_new_slave' => \orig_master_host=s' => \orig_master_ip=s' => \orig_master_port=iorig_master_user=sorig_master_password=s' => \orig_master_ssh_user=snew_master_host=s' => \new_master_ip=s' => \new_master_port=inew_master_user=snew_master_password=s' => \new_master_ssh_user=s);exit &main();sub start_vip(){ `ssh $new_master_ssh_user\@$new_master_host \ $ssh_start_vip \"`; `ssh $new_master_ssh_user\@$new_master_host \" $ssh_send_garp \"`;}sub stop_vip(){ `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;}sub current_time_us { my ( $sec,$microsec ) = gettimeofday(); my $curdate = localtime($sec); return $curdate . " . sprintf( "%06d,$microsec );}sub sleep_until { my $elapsed = tv_interval($_tstart); if ( $_running_interval > $elapsed ) { sleep( $_running_interval - $elapsed ); }}sub get_threads_util { my $dbh = shift; my $my_connection_ID = shift; my $running_time_threshold = shift; my $type = shift; $running_time_threshold = 0 unless ($running_time_threshold); $type = 0 unless ($type); my @threads; my $sth = $dbh->prepare("SHOW PROCESSList); $sth->execute(); while ( my $ref = $sth->fetchrow_hashref() ) { my $ID = $ref->{ID}; my $user = $ref->{User}; my $host = $ref->{Host}; my $command = $ref->{Command}; my $state = $ref->{State}; my $query_time = $ref->{Time}; my $info = $ref->{Info}; $info =~ s/^\s*(.*?)\s*$// if defined($info); next if ( $my_connection_ID == $ID ); next if ( defined($query_time) && $query_time < $running_time_threshold ); next if ( defined($command) && $command eq "binlog Dump ); next if ( defined($user) && $user eq system user ); next if ( defined($command) && $command eq Sleep && defined($query_time) && $query_time >= 1 ); if ( $type >= 1 ) { next if ( defined($command) && $command eq ); next if ( defined($command) && $command eq Connect ); } if ( $type >= 2 ) { next if ( defined($info) && $info =~ m/^select/i ); next if ( defined($info) && $info =~ m/^show/i ); } push @threads,$ref; } return @threads;}sub main { if ( $command eq "stop ) { ## Gracefully killing connections on the current master # 1. Set read_only= 1 on the new master # 2. DROP USER so that no app user can establish new connections # 3. Set read_only= 1 on the current master # 4. Kill current querIEs # * Any database access failure will result in script dIE. my $exit_code = 1; eval { ## Setting read_only=1 on the new master (to avoID accIDent) my $new_master_handler = new MHA::DBHelper(); # args: hostname,raise_error(dIE_on_error)_or_not $new_master_handler->connect( $new_master_ip,1 ); print current_time_us() . " Set read_only on the new master.. ; $new_master_handler->enable_read_only(); if ( $new_master_handler->is_read_only() ) { print "ok.\n; } else { dIE "Failed!\n; } $new_master_handler->disconnect(); # Connecting to the orig master,dIE if any database error happens my $orig_master_handler = new MHA::DBHelper(); $orig_master_handler->connect( $orig_master_ip,$orig_master_user,$orig_master_password,1 ); ## Drop application user so that nobody can connect. disabling per-session binlog beforehand $orig_master_handler->disable_log_bin_local(); # print current_time_us() . " DrpPing app user on the orig master..\n; #drop_app_user($orig_master_handler); ## Waiting for N * 100 milliseconds so that current connections can exit my $time_until_read_only = 15; $_tstart = [gettimeofday]; my @threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_ID} ); while ( $time_until_read_only > 0 && $#threads >= 0 ) { if ( $time_until_read_only % 5 == 0 ) { printf"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n"\n foreach (@threads); } } sleep_until(); $_tstart = [gettimeofday]; $time_until_read_only--; @threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_ID} ); } ## Setting read_only=1 on the current master so that nobody(except SUPER) can write print current_time_us() . " Set read_only=1 on the orig master.. ; $orig_master_handler->enable_read_only(); if ( $orig_master_handler->is_read_only() ) { print ; } ## Waiting for M * 100 milliseconds so that current update querIEs can complete my $time_until_kill_threads = 5; @threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_ID} ); while ( $time_until_kill_threads > 0 && $#threads >= 0 ) { if ( $time_until_kill_threads % 5 == 0 ) { printf%d querIEs are disconnected.. (max foreach (@threads); } } sleep_until(); $_tstart = [gettimeofday]; $time_until_kill_threads--; @threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_ID} ); } ## Terminating all threads print current_time_us() . " Killing all application threads..\n; $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 ); print current_time_us() . " done.\n; $orig_master_handler->enable_log_bin_local(); $orig_master_handler->disconnect(); ## DroPing the VIP print "disabling the VIP an old master: $orig_master_host \n; &stop_vip(); ## After finishing the script,MHA executes FLUSH tableS WITH READ LOCK $exit_code = 0; }; if ($@) { warn "Got Error: $@\n; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start ) { ## Activating master ip on the new master # 1. Create app user with write privileges # 2. Moving backup script if needed # 3. Register new master's ip to the catalog database# We don't return error even though activating updatable accounts/ip Failed so that we don't interrupt slaves' recovery.# If exit code is 0 or 10,MHA does not abort my $exit_code = 10; eval { my $new_master_handler = new MHA::DBHelper(); # args: hostname,raise_error_or_not $new_master_handler->connect( $new_master_ip,1 ); ## Set read_only=0 on the new master $new_master_handler->disable_log_bin_local(); print current_time_us() . 0 on the new master.\n; $new_master_handler->disable_read_only(); ## Creating an app user on the new master #print current_time_us() . " Creating app user on the new master..\n; # create_app_user($new_master_handler); print "Enabling the VIP $vip on the new master: $new_master_host \n; &start_vip(); $new_master_handler->enable_log_bin_local(); $new_master_handler->disconnect(); ## Update master ip on the catalog database,etc $exit_code = 0; }; if ($@) { warn "status ) { # do nothing exit 0; } else { &usage(); exit 1; }}sub usage { print"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n; dIE;}VIEw Code
master_ip_failover
$ssh_user,1)">$new_master_password;Getoptions( ' => \ssh_user=s' => \' => \' => \ main { if ( $command eq stop" || stopssh ) { $orig_master_host,# invalIDate orig_master_ip here. $exit_code = ; eval { print disabling the VIP an old master: $orig_master_host \nstop_vip(); ($@) { warn Got Error: $@\n; exit $exit_code; } elsif ( start all arguments are passed. # If you manage master ip address at global catalog database,# activate new_master_ip here. # You can also grant write access (create user,set read_only=0,etc) here. { $new_master_handler = new MHA::DBHelper(); args: hostname,raise_error_or_not $new_master_handler->connect( ); # Set read_only=0 on the new master disable_log_bin_local(); Set read_only=0 on the new master.\n; disable_read_only(); # Creating an app user on the new master # print "Creating app user on the new master..\n"; # FIXME_xxx_create_user( $new_master_handler->{dbh} ); enable_log_bin_local(); disconnect(); Enabling the VIP $vip on the new master: $new_master_host \nstart_vip(); warn $@; If you want to continue failover,exit 10. status do nothing usage(); ; }}$ssh_user\@ $ssh_start_vip \"`; `ssh $ssh_user\@$new_master_host \" $ssh_send_garp \"`;}sub stop_vip(){ return 0 unless ($ssh_user); `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;}sub usage { print"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n;}VIEw Code
masterha_secondary_check
!/bin/env perl# copyright (C) 2011 DeNA Co.,MA 02110-1301 USAuse English qw(-no_match_vars);use Pod::Usage;ManagerConst;@monitoring_servers$help,1)">$version,1)">$ssh_port,1)">$ssh_options,1)">$master_host,1)">$master_ip,1)">$master_port,1)">$master_user,1)">$master_password,1)">$Ping_type);$timeout = 5;$| = ;Getoptions( help' => \version' => \secondary_host=s' => \@monitoring_servers,1); Font-weight: bold">user=sport=soptions=smaster_host=smaster_ip=smaster_port=imaster_user=smaster_password=sPing_type=s$Ping_type,1); Font-weight: bold">timeout=i$timeout,1)">if ($version) { masterha_secondary_check version $MHA::ManagerConst::VERSION.\n; ;}$help) { pod2usage();}unless ($master_host exit_by_signal { ;}local $SIG{INT} = $SIG{HUP} = $SIG{QUIT} = $SIG{TERM} = \&exit_by_signal;$ssh_user = root$ssh_user);$ssh_port = 22 $ssh_port$master_port = 3306 $master_port);$ssh_options) { $MHA::ManagerConst::SSH_OPT_CHECK = ;}$MHA::ManagerConst::SSH_OPT_CHECK =~ s/VAR_CONNECT_TIMEOUT/$timeout/ 0: master is not reachable from all monotoring servers# 1: unkNown errors# 2: at least one of monitoring servers is not reachable from this script# 3: master is reachable from at least one of monitoring serversforeach $monitoring_server ($ssh_user_host = $ssh_user . @' . $monitoring_server$command =ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \"perl -e " . \\"use IO::Socket::INET; my \\\$sock = IO::Socket::INET->new(PeerAddr => \\\\\\\"$master_host\\\\\\\",PeerPort=> $master_port,1); Font-weight: bold">Proto =>'tcp',Timeout => $timeout); if(\\\\\\\$sock) { close(\\\\\\\$sock); exit 3; } exit 0;\\" \"$ret = system($command); $ret = $ret >> $ret == ) { printMonitoring server $monitoring_server is reachable,Master is not reachable from $monitoring_server. OK.\nnext3if ( defined $Ping_type && $Ping_type eq $MHA::ManagerConst::Ping_TYPE_INSERT ) { $ret_insert$command_insert = ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \'" . /usr/bin/MysqL -u$master_user -p$master_password -h$master_host -e \"CREATE DATABASE IF NOT EXISTS infra; CREATE table IF NOT EXISTS infra.chk_masterha (\`key\` tinyint NOT NulL primary key,\`val\` int(10) unsigned NOT NulL DEFAulT '0'\) engine=MyISAM; INSERT INTO infra.chk_masterha values (1,unix_timestamp()) ON DUPliCATE KEY UPDATE val=unix_timestamp()\"\'$sigalrm_timeout = { $SIG{ALRM} = { dIE timeout.\n; }; alarm $sigalrm_timeout; $ret_insert = $command_insert); $ret_insert = $ret_insert >> ; ; }; if ( $@ || $ret_insert != ) { ; } } Master is reachable from $monitoring_server!\n; last { Monitoring server $monitoring_server is NOT reachable!\n ############################################################################# documentation# ############################################################################=pod=head1 namemasterha_secondary_check - Checking master availability from additional network routes=head1 SYnopSISmasterha_secondary_check -s secondary_host1 -s secondary_host2 .. --user=ssh_username --master_host=host --master_ip=ip --master_port=portSee online reference (http://code.Google.com/p/MysqL-master-ha/wiki/Parameterssecondary_check_script) for details.=head1 DESCRIPTIONSee online reference (http://code.Google.com/p/MysqL-master-ha/wiki/Parameterssecondary_check_script) for details.VIEw Code
send_report
!/usr/bin/perl# copyright (C) 2011 DeNA Co.,1)">use Mail::Sender;Long;new_master_host and new_slave_hosts are set only when recovering master succeededmy ( $dead_master_host,1)">$new_slave_hosts,1)">$subject,1)">$body$smtp=smtp.126.com$mail_from=[email protected]$mail_user=$mail_pass=xxxxx$mail_to=[];Getoptions( new_slave_hosts=ssubject=sbody=s$body,1)">);mailToContacts($smtp,1)">$mail_from,1)">$mail_user,1)">$mail_pass,1)">$mail_to,1)"> mailToContacts { $user,1)">$passwd,1)">$msg ) = @_open $DEBUG,1); Font-weight: bold">> /tmp/monitormail.log or Can't open the deBUG file:$!\n$sender = new Mail::Sender { ctype => text/plain; charset=utf-8',1)"> enCoding => utf-8 smtp => from => auth => LOGIN TLS_allowed => 0 authID => authpwd => to => subject => deBUG => $DEBUG }; $sender->MailMsg( { msg => $msg,1)"> deBUG => } ) or print $Mail::Sender::Error; return ;} Do whatever you want here0;VIEw Code
总结
以上是内存溢出为你收集整理的MySQL高可用方案MHA的部署和原理全部内容,希望文章能够帮你解决MySQL高可用方案MHA的部署和原理所遇到的程序开发问题。
如果觉得内存溢出网站内容还不错,欢迎将内存溢出网站推荐给程序员好友。
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)