分别在MySQL5.7和8.0中测试主从复制中主库表缺失主键会导致主从延迟的情况
之前发布过的一篇类似文章:【DB宝34】测试MySQL主从复制中主库表缺失主键会导致主从延迟的情况
简介
检查延迟的方法:在从库上通过SHOW SLAVE STATUS检查Seconds_Behind_Master值即可获取主从复制延迟的秒数。
主从复制延迟,可能的原因有主库和从库方面:
① 主库写binlog不及时。
② dump线程压力大
③ IO线程阻塞
④ 表缺乏主键或唯一索引(常见)
假设主库更新一张500w表中的20w行数据,该update语句仅需要全表扫描1次;而在row格式下,记录到binlog日志中的SQL为20w次update操作,此时SQL Thread重放将特别慢,因为每一次update都需要进行一次全表扫描,即从库需要执行20w次的全表扫描。
⑤ 主库DML请求频繁(tps较大)
⑥ 主库执行大事务,导致从库SQL慢
⑦ 主库对大表执行DDL语句
⑧ 主库与从库硬件配置不一致
⑨ 从库自身压力过大
⑩ MyISAM存储引擎
⑪ 主从复制的服务器时钟是否一致。主从同步延迟与系统时间的关系,查看主从两台机器间系统时间差
⑫ 网络通信是否存在延时。主从同步延迟与压力、网络、机器性能的关系,查看从库的IO,cpu,mem及网络压力
⑬ 从库查询是否优化(比如存在查询慢,导致从库性能差,处理不过来)
⑭ 是否启用了延迟复制,使用“show slave status”查看SQL_Delay是否大于0
今天我们就通过实验的方式来验证第4种情况。
MySQL 5.7 环境准备
MySQL环境初始化
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 | -- 设置主从的网络环境 docker pull mysql:5.7.36 docker network create --subnet=172.72.5.0/24 mysql-network -- 删除之前的容器 rm -rf /lhrmysqltest3/master1/conf.d rm -rf /lhrmysqltest3/slave1/conf.d docker rm -f master1 slave1 -- 创建参数文件路径 mkdir -p /lhrmysqltest3/master1/conf.d mkdir -p /lhrmysqltest3/slave1/conf.d -- 配置主库参数 cat > /lhrmysqltest3/master1/conf.d/my.cnf <<"EOF" [mysqld] port=3306 character_set_server=utf8mb4 secure_file_priv= server-id = 5733650 log-bin = binlog_format=row skip-name-resolve gtid-mode=ON enforce-gtid-consistency=on report_host=172.72.5.50 default-time-zone = '+8:00' log_timestamps = SYSTEM log_slave_updates = 1 master_info_repository='table' relay_log_info_repository='table' relay_log_recovery=1 EOF -- 配置从库参数 cat > /lhrmysqltest3/slave1/conf.d/my.cnf <<"EOF" [mysqld] port=3306 character_set_server=utf8mb4 secure_file_priv= server-id = 5733651 log-bin = binlog_format=row gtid-mode=ON enforce-gtid-consistency=ON skip_name_resolve report_host=172.72.5.51 default-time-zone = '+8:00' log_timestamps = SYSTEM log_slave_updates = 1 master_info_repository='table' relay_log_info_repository='table' relay_log_recovery=1 EOF -- 申请主库环境 docker run -d --name master1 \ -h master1 -p 33650:3306 --net=mysql-network --ip 172.72.5.50 \ -v /lhrmysqltest3/master1/conf.d:/etc/mysql/conf.d \ -e MYSQL_ROOT_PASSWORD=lhr \ mysql:5.7.36 -- 申请从库环境 docker run -d --name slave1 \ -h slave1 -p 33651:3306 --net=mysql-network --ip 172.72.5.51 \ -v /lhrmysqltest3/slave1/conf.d:/etc/mysql/conf.d \ -e MYSQL_ROOT_PASSWORD=lhr \ mysql:5.7.36 -- 登陆 docker exec -it master1 bash docker exec -it master1 mysql -uroot -plhr -- 查询 mysql -uroot -plhr -h192.168.1.35 -P33650 -e "select @@hostname,@@server_id,@@server_uuid" mysql -uroot -plhr -h192.168.1.35 -P33651 -e "select @@hostname,@@server_id,@@server_uuid" |
主库配置
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | -- 主库创建复制用户repl mysql -uroot -plhr -h192.168.1.35 -P33650 grant replication slave on *.* to repl@'%' identified by 'lhr'; select user,host,grant_priv,password_last_changed,authentication_string from mysql.user; create database lhrdb; use lhrdb; create table t(id int,name varchar(30)); create table mytb1(id int,name varchar(30)); insert into mytb1 values(1,'a'),(2,'b'); -- 查询 show master status ; show slave hosts; select @@hostname,@@server_id,@@server_uuid; |
从库配置
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | mysql -uroot -plhr -h192.168.1.35 -P33651 -- 修改参数 change master to master_host='172.72.5.50', master_port=3306, master_user='repl', master_password='lhr', master_auto_position=1; -- 启动复制进程 start slave; show slave status \G; SELECT * FROM lhrdb.mytb1; -- 主库: insert into mytb1 values(3,'c'),(4,'d'); |
主从查询
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | -- 主库 show slave hosts; show master status; -- 从库 show slave status; -- 线程查询 SELECT * FROM performance_schema.threads a WHERE a.`NAME` IN ( 'thread/sql/slave_IO', 'thread/sql/slave_sql','thread/sql/slave_worker' ,'thread/sql/replica_io','thread/sql/replica_sql','thread/sql/replica_worker' ) or a.PROCESSLIST_COMMAND in ('Binlog Dump','Binlog Dump GTID') ; SELECT * FROM information_schema.`PROCESSLIST` a where a.USER='system user' or a.command in ('Binlog Dump','Binlog Dump GTID') ; |
MySQL 5.7实验过程
主库创建表
主库先创建一张8万行的大表:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | -- 分批提交,关闭自动提交 DELIMITER $$ create procedure `t_pro`(num int) begin declare i int unsigned default 1; set autocommit=0; set i=1; while i <= num do insert into `t` (`id`,`name`) values(i,concat('主键测试',i)); set i=i + 1; if i%10000 = 0 then commit; end if; end while; set autocommit=1; end$$ DELIMITER ; call `t_pro`(80000); select count(*) from lhrdb.t; |
主库做更新操作
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | MySQL [lhrdb]> flush logs; Query OK, 0 rows affected (0.02 sec) MySQL [lhrdb]> show master logs; +-------------------+-----------+ | Log_name | File_size | +-------------------+-----------+ | master-bin.000001 | 177 | | master-bin.000002 | 3071539 | | master-bin.000003 | 623044079 | | master-bin.000004 | 633267 | | master-bin.000005 | 194 | +-------------------+-----------+ 5 rows in set (0.05 sec) MySQL [lhrdb]> update t set name=concat('主键测试66,结果验证66',t.id) where id <=20000; Query OK, 20000 rows affected (0.72 sec) Rows matched: 20000 Changed: 20000 Warnings: 0 |
可以看出,主库基本在1s就更新完成,变化的行数为2万行。
从库查询延迟,
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 44 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33652 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 115 ... ... C:\Users\lhrxxt> C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 82 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33652 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 154 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 196 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33652 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 268 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 0 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33652 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 270 ... ... C:\Users\lhrxxt> C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 0 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33652 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 277 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 0 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33652 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 0 |
可以发现,最长延迟270秒左右,相当于5分钟左右。
分析主库的binlog日志
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | [root@docker35 ~]# docker exec -it master1 bash root@master:/# cd /var/lib/mysql root@master:/var/lib/mysql# mysqlbinlog master-bin.000005 --base64-output=decode-row -vv | grep UPDATE | wc -l 20000 root@master:/var/lib/mysql# mysqlbinlog master-bin.000005 --base64-output=decode-row -vv | more 。。。。。。。 ### UPDATE `lhrdb`.`t` ### WHERE ### @1=1 /* INT meta=0 nullable=1 is_null=0 */ ### @2='主键测试,结果验证1' /* VARSTRING(400) meta=400 nullable=1 is_null=0 */ ### SET ### @1=1 /* INT meta=0 nullable=1 is_null=0 */ ### @2='主键测试66,结果验证661' /* VARSTRING(400) meta=400 nullable=1 is_null=0 */ ### UPDATE `lhrdb`.`t` ### WHERE ### @1=2 /* INT meta=0 nullable=1 is_null=0 */ ### @2='主键测试,结果验证2' /* VARSTRING(400) meta=400 nullable=1 is_null=0 */ ### SET ### @1=2 /* INT meta=0 nullable=1 is_null=0 */ ### @2='主键测试66,结果验证662' /* VARSTRING(400) meta=400 nullable=1 is_null=0 */ 。。。。。。。 |
可以看出,在ROW模式下,在主库上执行了一条UPDATE语句,更新了2万行记录,但是在binlog中,记录了2万行的UPDATE语句。
分析从库的中继日志
1 2 3 4 5 | [root@docker35 ~]# docker exec -it slave1 bash root@slave1:/# cd /var/lib/mysql root@slave1:/var/lib/mysql# mysqlbinlog slave1-relay-bin.000011 --base64-output=decode-row -vv | grep UPDATE | wc -l 20000 |
可以看出,在从库上也是2万行的UPDATE语句,也是一条一条的进行更新。由于没有主键和索引,所以,就会导致在从库进行2万次的全表扫描,这样也就拖慢了从库APPLY的效率。
尝试添加并行
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | -- 主库 set global binlog_group_commit_sync_delay=10; set global binlog_group_commit_sync_no_delay_count=10; show variables like 'binlog_group_commit_sync_%'; -- 从库 STOP SLAVE SQL_THREAD; set global slave_parallel_type=LOGICAL_CLOCK; set global slave_parallel_workers=16; show variables like 'slave_parallel%'; start SLAVE SQL_THREAD; -- 线程查询 SELECT * FROM performance_schema.threads a WHERE a.`NAME` IN ( 'thread/sql/slave_IO', 'thread/sql/slave_sql','thread/sql/slave_worker' ,'thread/sql/replica_io','thread/sql/replica_sql','thread/sql/replica_worker' ) or a.PROCESSLIST_COMMAND in ('Binlog Dump','Binlog Dump GTID') ; SELECT * FROM information_schema.`PROCESSLIST` a where a.USER='system user' or a.command in ('Binlog Dump','Binlog Dump GTID') ; -- 主库更新 update t set name=concat('主键测试99,结果验证99',t.id) where id <=20000; -- 从库查询延迟,发现延迟并没有很大的改善 mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 195 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 0 |
解决延迟:表添加主键
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | -- 主库执行,会自动同步到从库 MySQL [lhrdb]> alter table t add primary key(id); Query OK, 0 rows affected (1.05 sec) Records: 0 Duplicates: 0 Warnings: 0 MySQL [lhrdb]> desc t; +-------+--------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+--------------+------+-----+---------+-------+ | id | int(11) | NO | PRI | NULL | | | name | varchar(100) | YES | | NULL | | +-------+--------------+------+-----+---------+-------+ 2 rows in set (0.05 sec) MySQL [lhrdb]> update t set name=concat('主键测试888,结果验证8888',t.id) where id <=30000; Query OK, 30000 rows affected (1.29 sec) Rows matched: 30000 Changed: 30000 Warnings: 0 -- 查询从库的延迟 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33651 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 3 C:\Users\lhrxxt>mysql -uroot -plhr -h192.168.1.35 -P33652 -e "show slave status \G" | grep Seconds_Behind_Master mysql: [Warning] Using a password on the command line interface can be insecure. Seconds_Behind_Master: 0 |
可以看到,在有主键的情况下,从库基本无延迟。
MySQL 8.0.27环境实验
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 | ## 1、初始化环境 docker rm -f master2 slave2 rm -rf /lhrmysqltest3/master2/ rm -rf /lhrmysqltest3/slave2/ mkdir -p /lhrmysqltest3/master2/conf.d mkdir -p /lhrmysqltest3/slave2/conf.d cat > /lhrmysqltest3/master2/conf.d/my.cnf <<"EOF" [mysqld] port=3306 character_set_server=utf8mb4 secure_file_priv= server-id = 8033660 log-bin = binlog_format=row skip-name-resolve gtid-mode=ON enforce-gtid-consistency=on report_host=172.72.5.60 default-time-zone = '+8:00' log_timestamps = SYSTEM log_slave_updates = 1 master_info_repository='table' relay_log_info_repository='table' relay_log_recovery=1 default_authentication_plugin=mysql_native_password EOF cat > /lhrmysqltest3/slave2/conf.d/my.cnf <<"EOF" [mysqld] port=3306 character_set_server=utf8mb4 secure_file_priv= server-id = 8033661 log-bin = binlog_format=row gtid-mode=ON enforce-gtid-consistency=ON skip_name_resolve report_host=172.72.5.61 default-time-zone = '+8:00' log_timestamps = SYSTEM log_slave_updates = 1 master_info_repository='table' relay_log_info_repository='table' relay_log_recovery=1 default_authentication_plugin=mysql_native_password EOF docker run -d --name master2 \ -h master -p 33660:3306 --net=mysql-network --ip 172.72.5.60 \ -v /lhrmysqltest3/master2/conf.d:/etc/mysql/conf.d \ -e MYSQL_ROOT_PASSWORD=lhr \ mysql:8.0.27 docker run -d --name slave2 \ -h slave2 -p 33661:3306 --net=mysql-network --ip 172.72.5.61 \ -v /lhrmysqltest3/slave2/conf.d:/etc/mysql/conf.d \ -e MYSQL_ROOT_PASSWORD=lhr \ mysql:8.0.27 ## 2、主库配置 -- 主库创建复制用户repl mysql -uroot -plhr -h192.168.1.35 -P33660 create user repl@'%' identified with mysql_native_password by 'lhr'; grant all on *.* to repl@'%' with grant option; flush privileges; select user,host,grant_priv,password_last_changed,authentication_string from mysql.user; create database lhrdb; use lhrdb; create table t(id int,name varchar(30)); create table mytb1(id int,name varchar(30)); insert into mytb1 values(1,'a'),(2,'b'); ## 3、从库配置 mysql -uroot -plhr -h192.168.1.35 -P33661 change master to master_host='172.72.5.60', master_port=3306, master_user='repl', master_password='lhr', master_auto_position=1; -- 启动复制进程 start slave; show slave status \G; SELECT * FROM lhrdb.mytb1; -- 主库: insert into mytb1 values(3,'c'),(4,'d'); SELECT * FROM lhrdb.mytb1; -- 主库建表 DELIMITER $$ create procedure `t_pro`(num int) begin declare i int unsigned default 1; set autocommit=0; set i=1; while i <= num do insert into `t` (`id`,`name`) values(i,concat('主键测试',i)); set i=i + 1; if i%10000 = 0 then commit; end if; end while; set autocommit=1; end$$ DELIMITER ; call `t_pro`(80000); select count(*) from lhrdb.t; update t set name=concat('主键测试66,结果验证66',t.id) where id <=20000; mysql -uroot -plhr -h192.168.1.35 -P33661 -e "show slave status \G" | grep Seconds_Behind_Master -- 主库开并行 set global binlog_group_commit_sync_delay=10; set global binlog_group_commit_sync_no_delay_count=10; show variables like 'binlog_group_commit_sync_%'; -- 从库 STOP SLAVE SQL_THREAD; set global slave_parallel_type=LOGICAL_CLOCK; set global slave_parallel_workers=16; show variables like 'slave_parallel%'; start SLAVE SQL_THREAD; -- 线程查询 SELECT * FROM performance_schema.threads a WHERE a.`NAME` IN ( 'thread/sql/slave_IO', 'thread/sql/slave_sql','thread/sql/slave_worker' ,'thread/sql/replica_io','thread/sql/replica_sql','thread/sql/replica_worker' ) or a.PROCESSLIST_COMMAND in ('Binlog Dump','Binlog Dump GTID') ; SELECT * FROM information_schema.`PROCESSLIST` a where a.USER='system user' or a.command in ('Binlog Dump','Binlog Dump GTID') ; mysql -uroot -plhr -h192.168.1.35 -P33661 -e "show slave status \G" | grep Seconds_Behind_Master -- 主库增加主键 alter table t add primary key(id); |
可见,主库更新2万行数据,从库延迟不超过5秒,但若主库更新6万行,则从库延迟接近20秒。说明,在MySQL 8中,性能有所提升,但仍然需要主键。
总结
1、在MySQL 5.7的主从复制架构中,若存在大表,那么一定要有主键或唯一索引,否则将导致很大的主从延迟。从库即使添加并行复制,也不能改善这种情况。
2、从MySQL 8.0开始的主从复制架构中,若主库大表没有主键,仍然会导致从库的延迟,但是,延迟的现象没有5.7那么严重,所以,我们仍然建议主库的大表一定需要有主键。