KunDB relay-log corruption

2024-05-20 Other common issues

Summary

This article describes how to troubleshoot and resolve abnormal KunDB relay-log files.

Details

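This problem typically appears after an unexpected event such as a cluster power loss, which leaves the KunDB service in an abnormal state.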

Check the KunDB role status

Connect to the KunDB role and run:

select * from performance_schema.replication_group_members;

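As shown in the screenshot below, only one primary is available. The remaining nodes are all in the RECOVERING state and drop out of the group after a while.
[Screenshot: query output]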

Check the logs

Check the log on each of the 2 problem nodes, /mnt/disk1/kundb11/kundbdata/error.log. Both contain the following error:

[ERROR] [MY-013122] [Repl] Slave SQL for channel 'group_replication_applier': Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, the server was unable to fetch a keyring key required to open an encrypted relay log file, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: MY-013122

Keywords: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted

If this error is present, the relay log files were corrupted when the power was lost, and the corrupted relay logs need to be cleared.
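
The log check can be scripted so that it is repeatable on each problem node. The following is a minimal sketch, not part of the original procedure: the function name has_relay_log_failure is hypothetical, and it assumes the error line keeps the MY-013122 code and the "Relay log read failure" wording quoted above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the given error log
# contains the relay-log read failure quoted in this article.
has_relay_log_failure() {
    log_file="$1"
    # Match the error code and the message text on the same log line.
    grep -q "MY-013122.*Relay log read failure" "$log_file"
}

# Example, run on a problem node (path taken from this article):
# has_relay_log_failure /mnt/disk1/kundb11/kundbdata/error.log && echo "relay log read failure found"
```

A match only shows that the applier could not read the relay log. The error text itself lists other possible causes, so the surrounding log entries are still worth reading before cleaning anything up.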

Cleanup procedure

On each problem node, move the relay-log directory aside and create a new, empty directory in its place.
Enter the KunDB pod and run:

ps -ef | grep mysqld

Use the path shown in the process listing, for example:

[Screenshot: output of ps -ef|grep mysqld]

Then run the following inside the corresponding pod:

cd /vdir/mnt/disk1/kundb11/kundbdata/
mv relay-logs relay-logs-bak
mkdir relay-logs
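
If the same cleanup has to be repeated, or a relay-logs-bak directory already exists from an earlier attempt, a timestamped backup name avoids a collision. This is a sketch under assumptions rather than the documented procedure: reset_relay_logs is a hypothetical helper, and the data directory passed to it is whatever path the mysqld process reported on your node.

```shell
#!/bin/sh
# Hypothetical helper: move <data_dir>/relay-logs aside under a timestamped
# name and recreate it as an empty directory.
reset_relay_logs() {
    data_dir="$1"
    src="$data_dir/relay-logs"
    backup="$data_dir/relay-logs-bak-$(date +%Y%m%d%H%M%S)"

    # Stop early if there is nothing to move, so no stray directory is created.
    [ -d "$src" ] || { echo "no relay-logs directory in $data_dir" >&2; return 1; }

    mv "$src" "$backup" || return 1
    mkdir "$src"
}

# Example, using the path from this article:
# reset_relay_logs /vdir/mnt/disk1/kundb11/kundbdata
```

The new directory is owned by whichever user runs the command. If that is not the user mysqld runs as, the ownership may need adjusting before the restart.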

Exit the pod, then restart the problem pod by running kubectl delete pod xxxx.

After the restart, connect to KunDB again and verify:

select * from performance_schema.replication_group_members;
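
Recovery can take some time, so it may help to reduce the verification to a single number that can be checked repeatedly. The sketch below is an illustration only: count_not_online is a hypothetical helper, and it assumes the query result is fed to it as tab-separated rows of MEMBER_HOST, MEMBER_STATE and MEMBER_ROLE, for example from the mysql client in batch mode. The connection options for your KunDB instance are not shown and must be supplied.

```shell
#!/bin/sh
# Hypothetical helper: read tab-separated rows (host, state, role) on stdin
# and print how many members are in a state other than ONLINE.
count_not_online() {
    # NF skips blank lines; n + 0 prints 0 when no row matched.
    awk -F '\t' 'NF && $2 != "ONLINE" { n++ } END { print n + 0 }'
}

# Example (add the host, port and credentials for your environment):
# mysql -N -B -e "SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members" | count_not_online
```

A result of 0 means every listed member reports ONLINE. A member that has left the group entirely may not appear as a row at all, so the row count should also match the expected number of nodes.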