MySQL Master-Slave Replication 02: percona-toolkit Tools (…)

2021-02-24, by 轻飘飘D

1. Download and install percona-toolkit (on xag201 and xag202)

[root@xag201 src]# 
wget https://downloads.percona.com/downloads/percona-toolkit/3.3.0/binary/redhat/7/x86_64/percona-toolkit-3.3.0-1.el7.x86_64.rpm

Install the packages the toolkit depends on
yum install perl-IO-Socket-SSL perl-DBD-MySQL perl-Time-HiRes perl perl-DBI -y
yum install perl-TermReadKey -y

Install the RPM; afterwards the percona-toolkit commands are available (type pt- and press TAB to list them)
rpm -ivh percona-toolkit-3.3.0-1.el7.x86_64.rpm

[root@xag201 src]# pt-
pt-align                  pt-duplicate-key-checker  pt-heartbeat              
pt-mext                   pt-pg-summary             pt-sift                   
pt-summary                pt-variable-advisor
pt-archiver               pt-fifo-split             pt-index-usage            
pt-mongodb-query-digest   pt-pmp                    pt-slave-delay            
pt-table-checksum         pt-visual-explain
pt-config-diff            pt-find                   pt-ioprofile              
pt-mongodb-summary        pt-query-digest           pt-slave-find             
pt-table-sync             
pt-deadlock-logger        pt-fingerprint            pt-k8s-debug-collector    
pt-mysql-summary          pt-secure-collect         pt-slave-restart          
pt-table-usage            
pt-diskstats              pt-fk-error-logger        pt-kill                   
pt-online-schema-change   pt-show-grants            pt-stalk                  
pt-upgrade
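
The TAB listing above is an interactive check. For a scripted check on both hosts, a minimal sketch along the following lines could be used. The function name missing_tools is an assumption made for illustration; it is not part of percona-toolkit.

```shell
#!/bin/bash
# Print the name of every command given as an argument that is not on PATH.
# missing_tools is a hypothetical helper; nothing here is specific to percona-toolkit.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example call (commented out so the sketch has no side effects):
# missing_tools pt-table-checksum pt-table-sync pt-heartbeat
```

An empty result means every listed tool was found.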

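2. Using pt-table-checksum (checks whether master and slave data are consistent)

Run the grant on the master. Be sure to grant access for the master's host: on both master and slave, grant to 'root'@'xag201'. The user name and password are up to you, but the account must be able to log in to both the master and the slave.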
mysql> 
GRANT SELECT, PROCESS, SUPER, REPLICATION SLAVE,CREATE,DELETE,INSERT,UPDATE ON *.* TO 'root'@'xag201' identified by '123456';

mysql> flush privileges;

Run the grant on the slave
mysql> 
GRANT SELECT, PROCESS, SUPER, REPLICATION SLAVE ON *.* TO 'root'@'xag201' IDENTIFIED BY '123456';

mysql> flush privileges;

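The command below, run on the master, checks master/slave data consistency
(remember to add --create-replicate-table on the first run; later runs do not need it).
In the command, xag201 is the master's host.
It checks table t1 in database testdb. The table can be omitted to check a whole database:
drop --tables=t1 to check all of testdb.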

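Common options:
--nocheck-replication-filters: do not check replication filters; recommended. Use --databases afterwards to name the databases to check.
--no-check-binlog-format: do not check the replication binlog format; without this option the tool reports an error when the binlog format is ROW.
--replicate-check-only: show only the out-of-sync information.
--replicate=: write the checksum results to the given table; writing it into the database being checked is recommended.
--databases=: the databases to check, comma-separated.
--tables=: the tables to check, comma-separated.
h=: master host
u=: user name
p=: password
P=: port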
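
The h=, u=, p= and P= parts above are joined with commas into a single DSN argument. As a small sketch of that format, the helper below assembles one from its parts. The name build_dsn and the default port of 3306 are assumptions made for illustration.

```shell
#!/bin/bash
# Build a DSN string of the form h=<host>,u=<user>,p=<password>,P=<port>.
# build_dsn is a hypothetical helper; the port falls back to 3306 when omitted.
# Values that themselves contain commas are not handled by this sketch.
build_dsn() {
  printf 'h=%s,u=%s,p=%s,P=%s\n' "$1" "$2" "$3" "${4:-3306}"
}

# Example call (commented out so the sketch has no side effects):
# pt-table-checksum --replicate=testdb.checksums --databases=testdb "$(build_dsn xag201 root 123456)"
```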

# First run
[root@xag201 src]# pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --replicate=testdb.checksums --create-replicate-table --databases=testdb --tables=t1 h=xag201,u=root,p=123456,P=3306
Checking if all tables can be checksummed ...
Starting checksum ...
*******************************************************************
 Using the default of SSL_verify_mode of SSL_VERIFY_NONE for client
 is deprecated! Please set SSL_verify_mode to SSL_VERIFY_PEER
 possibly with SSL_ca_file|SSL_ca_path for verification.
 If you really don't want to verify the certificate and keep the
 connection open to Man-In-The-Middle attacks please set
 SSL_verify_mode explicitly to SSL_VERIFY_NONE in your application.
*******************************************************************
  at /usr/bin/pt-table-checksum line 332.
*******************************************************************
 Using the default of SSL_verify_mode of SSL_VERIFY_NONE for client
 is deprecated! Please set SSL_verify_mode to SSL_VERIFY_PEER
 possibly with SSL_ca_file|SSL_ca_path for verification.
 If you really don't want to verify the certificate and keep the
 connection open to Man-In-The-Middle attacks please set
 SSL_verify_mode explicitly to SSL_VERIFY_NONE in your application.
*******************************************************************
  at /usr/bin/pt-table-checksum line 332.

# A software update is available:
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
02-16T20:23:07      0      0        5          0       1       0   0.030 testdb.t1

## Subsequent runs
[root@xag201 src]# pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --replicate=testdb.checksums --databases=testdb --tables=t1 h=xag201,u=root,p=123456,P=3306
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
02-16T20:23:45      0      0        5          0       1       0   0.019 testdb.t1

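Explanation:
TS: time the check finished.
ERRORS: number of errors and warnings raised during the check.
DIFFS: number of chunks that differ from the master; 0 means consistent. It is always 0 with --no-replicate-check, and with --replicate-check-only only tables with differences are shown.
ROWS: number of rows in the table.
DIFF_ROWS: maximum number of differing rows per chunk.
CHUNKS: number of chunks the table was split into.
SKIPPED: number of chunks skipped because of errors, warnings, or being oversized.
TIME: elapsed time.
TABLE: name of the table checked.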
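
The columns described above can be filtered in a script. The following is a minimal sketch, assuming the column layout shown in the sample output (DIFFS is the third field and TABLE the last one). The function name tables_with_diffs is made up for illustration.

```shell
#!/bin/bash
# Read pt-table-checksum output on stdin and print "<table> <diffs>" for every
# table whose DIFFS column (field 3) is greater than zero.
# Data rows are recognized by their leading MM-DDThh:mm:ss timestamp, so the
# header line and any warning text are ignored.
tables_with_diffs() {
  awk '$1 ~ /^[0-9][0-9]-[0-9][0-9]T[0-9:]+$/ && $3 > 0 { print $NF, $3 }'
}

# Example call (commented out so the sketch has no side effects):
# pt-table-checksum ... h=xag201,u=root,p=123456,P=3306 | tables_with_diffs
```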

3. Create the test_scheduler table

create table test_scheduler
(
  sch_seq int auto_increment not null,
  hostname varchar(50),
  server_id varchar(50),
  sch_createtime datetime,
  primary key(sch_seq)
);

insert into test_scheduler(hostname,server_id,sch_createtime) values(@@hostname,@@server_id,now());


# Stored procedure
DROP PROCEDURE IF EXISTS  proc_test_scheduler;

DELIMITER $$

create procedure proc_test_scheduler()
LANGUAGE SQL
DETERMINISTIC
MODIFIES SQL DATA
SQL SECURITY DEFINER
begin
delete from test_scheduler where server_id=cast(@@server_id as char(50)) and 
sch_createtime<date_sub(now(),interval 5 minute);
insert into test_scheduler(hostname,server_id,sch_createtime) values(@@hostname,@@server_id,now());
end$$

DELIMITER ;

call proc_test_scheduler();

# Call proc_test_scheduler() every 60 seconds
DROP EVENT IF EXISTS even_test_scheduler;
CREATE EVENT IF NOT EXISTS even_test_scheduler
ON SCHEDULE EVERY 60 SECOND
DO call proc_test_scheduler();

4. View the event

root@xag201:testdb [:33: ] 41 SQL->
SELECT EVENT_SCHEMA,EVENT_NAME,EVENT_DEFINITION,INTERVAL_FIELD,STATUS,LAST_EXECUTED FROM information_schema.EVENTS\G
*************************** 1. row ***************************
    EVENT_SCHEMA: testdb
      EVENT_NAME: even_test_scheduler
EVENT_DEFINITION: call proc_test_scheduler()
  INTERVAL_FIELD: SECOND
          STATUS: ENABLED
   LAST_EXECUTED: 2021-02-17 00:32:42

root@xag202:testdb [:33: ] 16 SQL->
SELECT EVENT_SCHEMA,EVENT_NAME,EVENT_DEFINITION,INTERVAL_FIELD,STATUS,LAST_EXECUTED FROM information_schema.EVENTS\G
*************************** 1. row ***************************
    EVENT_SCHEMA: testdb
      EVENT_NAME: even_test_scheduler
EVENT_DEFINITION: call proc_test_scheduler()
  INTERVAL_FIELD: SECOND
          STATUS: SLAVESIDE_DISABLED
   LAST_EXECUTED: NULL

5. Deliberately create an inconsistency

# Slave
root@xag202:testdb [:45: ] 7 SQL->stop slave;  

root@xag202:testdb [:45: ] 8 SQL-> reset slave;

root@xag202:testdb [:45: ] 9 SQL->show slave status\G;
*************************** 1. row ***************************
...
             Slave_IO_Running: No
            Slave_SQL_Running: No


root@xag201:testdb [:48: ] 49 SQL->show master status;
+---------------+----------+--------------+------------------+-------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------+
| binlog.000023 |    12081 |              |                  |                   |
+---------------+----------+--------------+------------------+-------------------+

root@xag202:testdb [:16: ] 6 SQL->
change master to master_host='xag201', master_port=3306, master_user='repl', master_password='rep123',master_log_file='binlog.000023',master_log_pos=12081;

root@xag202:testdb [:53: ] 13 SQL->start slave;

root@xag202:testdb [:53: ] 14 SQL->show slave status\G;
*************************** 1. row ***************************
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
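
In the steps above, the binlog file and position are copied by hand from SHOW MASTER STATUS into CHANGE MASTER TO. Presumably the inconsistency comes from re-pointing the slave at the master's current position, which skips whatever the master wrote in between. As a sketch, the coordinates could also be extracted from the boxed output with a helper like the one below. The name master_coordinates is an assumption made for illustration.

```shell
#!/bin/bash
# Read the boxed output of SHOW MASTER STATUS on stdin and print
# "<file> <position>" taken from the first data row.
# Border lines contain no "|" and the header row contains "File"; both are skipped.
master_coordinates() {
  awk -F'|' 'NF > 3 && $2 !~ /File/ {
    gsub(/ /, "", $2); gsub(/ /, "", $3)
    print $2, $3
    exit
  }'
}

# Example call (commented out so the sketch has no side effects):
# mysql -t -e 'show master status' | master_coordinates
```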

6. Check for inconsistencies & using pt-table-sync (efficiently synchronizes data between MySQL tables)

[root@xag201 ~]# 
pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --replicate=testdb.checksums --databases=testdb --tables=test_scheduler h=xag201,u=root,p=123456,P=3306
---------------------------------------------------------------------------------------------
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
02-17T00:57:08      0      1        8          4       1       0   0.015 testdb.test_scheduler
------------------------------------------------------------------------------------------------

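Options:
--replicate=: the table produced by pt-table-checksum; the two tools are almost always used together.
--databases=: the databases to synchronize.
--tables=: the tables to synchronize, comma-separated.
--sync-to-master: takes a single DSN, the slave's; the tool locates the master automatically via SHOW PROCESSLIST or SHOW SLAVE STATUS.
h=: server address. The command contains two hosts: the first is the master, the second is the slave.
u=: account.
p=: password.
--print: print the statements without executing them.
--execute: execute the statements.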

[root@xag201 ~]# pt-table-sync --replicate=testdb.checksums h=xag201,u=root,p=123456 h=xag202,u=root,p=123456 --print
DELETE FROM `testdb`.`test_scheduler` WHERE `sch_seq`='23' LIMIT 1 /*percona-toolkit src_db:testdb src_tbl:test_scheduler src_dsn:h=xag201,p=...,u=root dst_db:testdb dst_tbl:test_scheduler dst_dsn:h=xag212,p=...,u=root lock:1 transaction:1 changing_src:testdb.checksums replicate:testdb.checksums bidirectional:0 pid:2531 user:root host:xag201*/;
DELETE FROM `testdb`.`test_scheduler` WHERE `sch_seq`='25' LIMIT 1 /*percona-toolkit src_db:testdb src_tbl:test_scheduler src_dsn:h=xag201,p=...,u=root dst_db:testdb dst_tbl:test_scheduler dst_dsn:h=xag212,p=...,u=root lock:1 transaction:1 changing_src:testdb.checksums replicate:testdb.checksums bidirectional:0 pid:2531 user:root host:xag201*/;
DELETE FROM `testdb`.`test_scheduler` WHERE `sch_seq`='27' LIMIT 1 /*percona-toolkit src_db:testdb src_tbl:test_scheduler src_dsn:h=xag201,p=...,u=root dst_db:testdb dst_tbl:test_scheduler dst_dsn:h=xag212,p=...,u=root lock:1 transaction:1 changing_src:testdb.checksums replicate:testdb.checksums bidirectional:0 pid:2531 user:root host:xag201*/;
DELETE FROM `testdb`.`test_scheduler` WHERE `sch_seq`='29' LIMIT 1 /*percona-toolkit src_db:testdb src_tbl:test_scheduler src_dsn:h=xag201,p=...,u=root dst_db:testdb dst_tbl:test_scheduler dst_dsn:h=xag212,p=...,u=root lock:1 transaction:1 changing_src:testdb.checksums replicate:testdb.checksums bidirectional:0 pid:2531 user:root host:xag201*/;
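
When --print produces many statements, a quick summary can help before deciding whether to apply them. The sketch below counts the printed statements by their first keyword. The function name count_sync_statements is made up for illustration, and it assumes one statement per line as in the output above.

```shell
#!/bin/bash
# Read pt-table-sync --print output on stdin and count statements by type
# (the first word of each non-empty line), printed as "<count> <TYPE>"
# and sorted by type name.
count_sync_statements() {
  awk 'NF { n[$1]++ } END { for (t in n) print n[t], t }' | sort -k2
}

# Example call (commented out so the sketch has no side effects):
# pt-table-sync --replicate=testdb.checksums h=xag201,u=root,p=123456 h=xag202,u=root,p=123456 --print | count_sync_statements
```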

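With the options covered, the repair itself comes next.
--print shows the SQL statements that would fix the data. They can be run by hand on the slave
to bring the data back in line, but that is tedious.
The repair can instead be run directly from the master with --execute: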
[root@xag201 ~]# pt-table-sync --replicate=testdb.checksums h=xag201,u=root,p=123456 h=xag202,u=root,p=123456 --execute

After the repair, checking again shows that master and slave data are consistent:
[root@xag201 ~]# pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --replicate=testdb.checksums --databases=testdb --tables=test_scheduler h=xag201,u=root,p=123456,P=3306
-----------------------------------------------------------------------------------------
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
02-17T01:03:23      0      0        8          0       1       0   0.014 testdb.test_scheduler
-----------------------------------------------------------------------------------------

7. Write a monitoring script that checks periodically and forces a repair when master and slave data differ (optional)

[root@xag201 ~]# /usr/bin/pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --replicate=testdb.checksums --databases=testdb  h=xag201,u=root,p=123456,P=3306
---------------------------------------------------------------------------------
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  DIFF_ROWS  CHUNKS SKIPPED    TIME TABLE
02-17T11:14:07      0      0        5          0       1       0   0.016 testdb.t1
02-17T11:14:07      0      0        8          0       1       0   0.017 testdb.test_scheduler
--------------------------------------------------------------------------------------

[root@xag201 ~]# /usr/bin/pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --replicate=testdb.checksums --databases=testdb  h=xag201,u=root,p=123456,P=3306|awk -F" " '{print $3}'|sed -n '4p'
----------------------------------------
0
----------------------------------------
[root@xag201 ~]# vim pt_testdb.sh
[root@xag201 ~]# chmod +x pt_testdb.sh 

[root@xag201 ~]# ./pt_testdb.sh
data is ok

[root@xag201 ~]# cat /root/pt_testdb.sh
--------------------------------------------------------------------------------------------------------
#!/bin/bash
# Sum the DIFFS column (field 3) over every table row; rows start with a MM-DDThh:mm:ss timestamp.
NUM=$(/usr/bin/pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --replicate=testdb.checksums --databases=testdb  h=xag201,u=root,p=123456,P=3306|awk '$1 ~ /^[0-9][0-9]-[0-9][0-9]T/ {s += $3} END {print s + 0}')
# A non-zero total means at least one chunk differs: print the fix, then apply it.
if [ "$NUM" -gt 0 ];then
  /usr/bin/pt-table-sync --replicate=testdb.checksums h=xag201,u=root,p=123456 h=xag202,u=root,p=123456 --print
  /usr/bin/pt-table-sync --replicate=testdb.checksums h=xag201,u=root,p=123456 h=xag202,u=root,p=123456 --execute
else
  echo "data is ok"
fi
------------------------------------------------------------------------------------------------------------

Check testdb consistency between master and slave (every 10 seconds)

[root@xag201 ~]# crontab -l
* * * * * /bin/bash -x /root/pt_testdb.sh > /dev/null 2>&1
* * * * * sleep 10;/bin/bash -x /root/pt_testdb.sh > /dev/null 2>&1
* * * * * sleep 20;/bin/bash -x /root/pt_testdb.sh > /dev/null 2>&1
* * * * * sleep 30;/bin/bash -x /root/pt_testdb.sh > /dev/null 2>&1
* * * * * sleep 40;/bin/bash -x /root/pt_testdb.sh > /dev/null 2>&1
* * * * * sleep 50;/bin/bash -x /root/pt_testdb.sh > /dev/null 2>&1
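
cron fires at most once per minute, so the entries above reach a 10-second cadence by starting six jobs each minute with staggered sleeps. As a sketch of that pattern, the helper below generates such lines for any step. The name cron_lines_every is an assumption made for illustration, and the output format is modeled on the crontab above without the -x flag.

```shell
#!/bin/bash
# Print one crontab line per offset so that <script> starts every <step> seconds.
# Each line fires every minute and sleeps for its own offset first.
cron_lines_every() {
  local step=$1 script=$2 offset=0
  [ "$step" -gt 0 ] || return 1   # a non-positive step would loop forever
  while [ "$offset" -lt 60 ]; do
    if [ "$offset" -eq 0 ]; then
      echo "* * * * * /bin/bash $script > /dev/null 2>&1"
    else
      echo "* * * * * sleep $offset; /bin/bash $script > /dev/null 2>&1"
    fi
    offset=$((offset + step))
  done
}

# Example call (commented out so the sketch has no side effects):
# cron_lines_every 10 /root/pt_testdb.sh
```

If a checksum run can take longer than the step, runs may overlap; guarding the script with a lock is one option to consider.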

Monitoring MySQL replication delay with pt-heartbeat

# Master
[root@xag201 ~]# pt-heartbeat --user=root --ask-pass --host=xag201 --create-table -D testdb --interval=3 --update --replace --daemonize

[root@xag201 ~]# ps -ef|grep pt-heartbeat
root       2573      1  0 11:40 ?        00:00:00 perl /usr/bin/pt-heartbeat --user=root --ask-pass --host=xag201 --create-table -D testdb --interval=3 --update --replace --daemonize

root@xag201:testdb [:41: ] 21 SQL->show create table heartbeat \G;
*************************** 1. row ***************************
       Table: heartbeat
Create Table: CREATE TABLE `heartbeat` (
  `ts` varchar(26) NOT NULL,
  `server_id` int(10) unsigned NOT NULL,
  `file` varchar(255) DEFAULT NULL,
  `position` bigint(20) unsigned DEFAULT NULL,
  `relay_master_log_file` varchar(255) DEFAULT NULL,
  `exec_master_log_pos` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`server_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

Note: the ts and position columns of the heartbeat table are updated continuously, and ts is the key to measuring replication delay.

root@xag201:testdb [:41: ] 22 SQL->select * from heartbeat;
+----------------------------+-----------+---------------+----------+-----------------------+---------------------+
| ts                         | server_id | file          | position | relay_master_log_file | exec_master_log_pos |
+----------------------------+-----------+---------------+----------+-----------------------+---------------------+
| 2021-02-17T11:42:03.008650 |       201 | binlog.000024 |    67548 | binlog.000026         |                 154 |
+----------------------------+-----------+---------------+----------+-----------------------+---------------------+

[root@xag201 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --host=xag202 --user=root --password=123456
1.01s [  0.02s,  0.00s,  0.00s ]
2.01s [  0.05s,  0.01s,  0.00s ]
0.00s [  0.05s,  0.01s,  0.00s ]
1.00s [  0.07s,  0.01s,  0.00s ]
1.99s [  0.10s,  0.02s,  0.01s ]
0.00s [  0.10s,  0.02s,  0.01s ]
1.00s [  0.12s,  0.02s,  0.01s ]

Explanation: 0 means the slave has no delay. [ 0.00s, 0.00s, 0.00s ] are the 1m, 5m and 15m averages; the windows can be changed with --frames.

# Slave
[root@xag202 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --host=xag202 --user=root --password=123456
0.00s [  0.00s,  0.00s,  0.00s ]
1.00s [  0.02s,  0.00s,  0.00s ]
0.00s [  0.02s,  0.00s,  0.00s ]
0.00s [  0.02s,  0.00s,  0.00s ]
1.00s [  0.03s,  0.01s,  0.00s ]

pt-heartbeat command format
pt-heartbeat [OPTIONS] [DSN] --update|--monitor|--check|--stop

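Note: at least one of --stop, --update, --monitor, --check must be specified.
--update, --monitor and --check are mutually exclusive, and so are --daemonize and --check.
--ask-pass     prompt for the MySQL password instead of taking it on the command line
--charset     character set
--check      check the slave's delay once and exit; with --recurse, check all slaves recursively.
--check-read-only    if the server has read-only mode enabled, the tool skips any inserts.
--create-table    create the heartbeat table on the master if it does not exist (it can also be created by hand;
                  the MEMORY storage engine is suggested). Updates to this table are what reveal the replication delay.
--daemonize   run in the background
--user, -u    account used to connect
--database, -D    database to connect to
--host, -h     host to connect to
--password, -p     password used to connect
--port, -P     port to connect to
--socket, -S    socket file to connect through
--file (e.g. --file=output.txt)   write the latest --monitor line to the given file, which keeps the screen from filling up.
--frames (e.g. --frames=1m,2m,3m)  the time windows for the averages shown in [] in --monitor output; default 1m,5m,15m. A single window may be given, e.g. --frames=1s; separate several with commas. Units: seconds (s), minutes (m), hours (h), days (d).
--interval   interval between checks or updates. Default 1s. The smallest unit is 0.01s and precision is two decimal places, so 0.015 is rounded to 0.02.
--log    in daemonized mode, write all output to the given file.
--monitor    monitor the slave's delay continuously, printing it at the interval set by --interval; with --file the output goes to the given file.
--master-server-id    the master's server_id; if omitted, the tool connects to the master to look it up.
--print-master-server-id    in --monitor and --check modes, also print the master's server_id.
--recurse    check depth for multi-level replication. In an M-S-S... chain, every slave except the last must have log_slave_updates enabled for the check to work.
--recursion-method     how slaves are discovered; default processlist,hosts.
--update    update the heartbeat table on the master.
--replace     write the timestamp with REPLACE instead of UPDATE, so it works whether or not the table already has a row.
--stop    stop running instances (--daemonize) by creating the file /tmp/pt-heartbeat-sentinel. That file must be removed before daemonized instances can be started again.
--table   name of the heartbeat table; default heartbeat.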
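
The mode rules stated at the top of the list (one mode required; --update, --monitor and --check mutually exclusive; --check not combinable with --daemonize) can be pre-checked in a wrapper script. The following is only a sketch of those rules as written here, not of pt-heartbeat's own validation. The function name valid_heartbeat_modes is made up for illustration.

```shell
#!/bin/bash
# Return 0 if the given options satisfy the mode rules listed above;
# otherwise print the reason and return 1. Other options are ignored.
valid_heartbeat_modes() {
  local opt modes=0 check=0 daemon=0 stop=0
  for opt in "$@"; do
    case "$opt" in
      --update|--monitor) modes=$((modes + 1)) ;;
      --check)            modes=$((modes + 1)); check=1 ;;
      --stop)             stop=1 ;;
      --daemonize)        daemon=1 ;;
    esac
  done
  if [ "$modes" -gt 1 ]; then
    echo "--update, --monitor and --check are mutually exclusive"; return 1
  fi
  if [ "$modes" -eq 0 ] && [ "$stop" -eq 0 ]; then
    echo "one of --update, --monitor, --check or --stop is required"; return 1
  fi
  if [ "$check" -eq 1 ] && [ "$daemon" -eq 1 ]; then
    echo "--check cannot be combined with --daemonize"; return 1
  fi
  return 0
}
```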

11. Monitor replication delay from the master

[root@xag201 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --host=xag202 --user=root --password=123456
2.00s [  0.03s,  0.01s,  0.00s ]
0.00s [  0.03s,  0.01s,  0.00s ]
0.99s [  0.05s,  0.01s,  0.00s ]
2.00s [  0.08s,  0.02s,  0.01s ]
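Explanation: 0 means the slave has no delay. [ 0.00s, 0.00s, 0.00s ] are the 1m, 5m and 15m averages; the windows can be changed with --frames.

Add the --master-server-id option (the server-id value configured in the master's my.cnf)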
[root@xag201 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --host=xag202 --user=root --password=123456 --master-server-id=201
1.01s [  0.02s,  0.00s,  0.00s ]
2.00s [  0.05s,  0.01s,  0.00s ]
0.00s [  0.05s,  0.01s,  0.00s ]
0.99s [  0.07s,  0.01s,  0.00s ]
1.99s [  0.10s,  0.02s,  0.01s ]

Print the master's server-id (--print-master-server-id)
[root@xag201 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --host=xag202 --user=root --password=123456 --print-master-server-id
1.00s [  0.02s,  0.00s,  0.00s ] 201
2.00s [  0.05s,  0.01s,  0.00s ] 201
0.00s [  0.05s,  0.01s,  0.00s ] 201
0.99s [  0.07s,  0.01s,  0.00s ] 201

--check can be used to check once and exit.
Note: --check cannot be used together with --monitor.
--update, --monitor and --check are mutually exclusive, and so are --daemonize and --check.

[root@xag201 ~]# pt-heartbeat -D testdb --table=heartbeat --check --host=xag202 --user=root --password=123456 --print-master-server-id
2.00 201

[root@xag201 ~]# pt-heartbeat -D testdb --table=heartbeat --check --host=xag202 --user=root --password=123456
0.00
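
Because --check prints a single number, its output lends itself to threshold-based alerting. The sketch below maps a delay to a Nagios-style status and exit code. The function name classify_delay and the thresholds are assumptions made for illustration; they do not come from pt-heartbeat.

```shell
#!/bin/bash
# Map a delay in seconds (as printed by --check) to a status word.
# Usage: classify_delay <delay> <warn_seconds> <crit_seconds>
# Exit status: 0 for OK, 1 for WARNING, 2 for CRITICAL.
# awk does the comparison because the shell's test command cannot compare decimals.
classify_delay() {
  awk -v d="$1" -v w="$2" -v c="$3" 'BEGIN {
    if (d + 0 >= c + 0)      { print "CRITICAL"; exit 2 }
    else if (d + 0 >= w + 0) { print "WARNING";  exit 1 }
    print "OK"; exit 0
  }'
}

# Example call (commented out so the sketch has no side effects):
# classify_delay "$(pt-heartbeat -D testdb --table=heartbeat --check --host=xag202 --user=root --password=123456)" 1 5
```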


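Note:
To feed this output into automated monitoring, write the monitor output to a file with the command below,
then have a script periodically pick the maximum value from the file and alert on it.
The --log option only writes to the file when --daemonize is also given,
and the file is best placed under /tmp; elsewhere it may fail to be created because of permissions.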

[root@xag201 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --host=xag202 --user=root --password=123456 --log=/root/master-slave.txt --daemonize

[root@xag201 ~]# tail -f master-slave.txt 
2.00s [  0.15s,  0.03s,  0.01s ]
0.00s [  0.15s,  0.03s,  0.01s ]
1.00s [  0.17s,  0.03s,  0.01s ]

12. Monitor replication delay from the slave
(--master-server-id=201 or --print-master-server-id can also be appended, as above)

[root@xag202 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --user=root --password=123456
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
1.00s [  0.02s,  0.00s,  0.00s ]

[root@xag202 ~]# pt-heartbeat -D testdb --table=heartbeat --user=root --password=123456 --check
0.00
[root@xag202 ~]# pt-heartbeat -D testdb --table=heartbeat --monitor --user=root --password=123456 --log=/root/master-slave.txt --daemonize

[root@xag202 ~]# tail -f /root/master-slave.txt
0.99s [  0.02s,  0.00s,  0.00s ]
0.00s [  0.02s,  0.00s,  0.00s ]
0.00s [  0.02s,  0.00s,  0.00s ]

Stopping the heartbeat processes started above

Method 1: stop them with --stop
[root@xag201 ~]# ps -ef|grep heartbeat
root       2573      1  0 11:40 ?        00:00:01 perl /usr/bin/pt-heartbeat --user=root --ask-pass --host=xag201 --create-table -D testdb --interval=3 --update --replace --daemonize
root       2637      1  0 12:07 ?        00:00:00 perl /usr/bin/pt-heartbeat -D testdb --table=heartbeat --monitor --host=xag202 --user=root --password=123456 --log=/root/master-slave.txt --daemonize

[root@xag201 ~]# pt-heartbeat --stop
Successfully created file /tmp/pt-heartbeat-sentinel

To start background processes again later, be sure to delete /tmp/pt-heartbeat-sentinel first; otherwise they will not start.

[root@xag201 ~]# ps -ef|grep heartbeat
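
Since a leftover sentinel file keeps daemonized instances from starting, a restart script might clear it first. A minimal sketch follows; the function name clear_sentinel is made up for illustration, and the default path is the one reported by pt-heartbeat --stop above.

```shell
#!/bin/bash
# Remove a pt-heartbeat sentinel file if it exists and report what was done.
# The path can be passed as the first argument; it defaults to /tmp/pt-heartbeat-sentinel.
clear_sentinel() {
  local sentinel=${1:-/tmp/pt-heartbeat-sentinel}
  if [ -e "$sentinel" ]; then
    rm -f "$sentinel" && echo "removed $sentinel"
  else
    echo "no sentinel at $sentinel"
  fi
}

# Example call (commented out so the sketch has no side effects):
# clear_sentinel && pt-heartbeat --user=root --ask-pass --host=xag201 -D testdb --interval=3 --update --replace --daemonize
```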

Method 2: kill the process by pid (recommended)
[root@xag202 ~]# ps -ef|grep heartbeat
root       4545      1  0 12:18 ?        00:00:00 perl /usr/bin/pt-heartbeat -D testdb --table=heartbeat --monitor --user=root --password=123456 --log=/root/master-slave.txt --daemonize

[root@xag202 ~]# kill -9 4545

[root@xag202 ~]# ps -ef|grep heartbeat

Monitoring script

[root@xag201 ~]# cat check-slave-monit.sh
------------------------------------------------------------------------
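#!/bin/bash
# Snapshot the monitor log, then empty it so the next run only sees new samples.
cat /root/master-slave.txt > /root/master_slave.txt
echo > /root/master-slave.txt
# Largest instantaneous delay (first column, e.g. "2.00s") in the snapshot.
max_time=$(grep -v '^$' /root/master_slave.txt |awk '{print $1}' |sort -k1nr |head -1)
NUM=$(echo "$max_time"|cut -d"s" -f1)
# Quoted so that an empty snapshot does not break the test.
if [ "$NUM" = "0.00" ];then
   echo "MySQL master and slave are in sync"
else
   echo "MySQL replication is delayed"
fi

--------------------------------------------------------------------------

[root@xag201 ~]# chmod +x check-slave-monit.sh 

[root@xag201 ~]# ./check-slave-monit.sh 
MySQL replication is delayed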
