Outline
Summary
This article describes how to troubleshoot and resolve a pod that is stuck in the ContainerCreating phase when the pod events shown by kubectl describe pod -n xxxx report: MountVolume.SetUp failed for volume "xxx": rpc error: code = FailedPrecondition desc = bindMountVolume: staging target path /var/lib/kubelet/plugins/kubernetes.io/csi/pv/xxx/globalmount is not mounted yet
Details
Problem description
The pod is stuck in the ContainerCreating phase, and the pod events shown by kubectl describe pod -n xxxx report the following error:
MountVolume.SetUp failed for volume "xxx": rpc error: code = FailedPrecondition desc = bindMountVolume: staging target path /var/lib/kubelet/plugins/kubernetes.io/csi/pv/xxx/globalmount is not mounted yet
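The staging path and PV name embedded in this message are what the later steps filter logs and mount points by. The sketch below pulls both out of the event text. The function names are hypothetical helpers introduced for illustration, and the pattern assumes the message has exactly the wording shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the staging path contained in a
# "MountVolume.SetUp failed ... is not mounted yet" event message.
# $1: the event message text
staging_path_from_event() {
  printf '%s\n' "$1" |
    sed -n 's|.*staging target path \(.*/globalmount\) is not mounted yet.*|\1|p'
}

# Hypothetical helper: the PV name is the directory just above globalmount.
# $1: a staging path as printed by staging_path_from_event
pv_name_from_staging_path() {
  basename "$(dirname "$1")"
}

# Example usage (paste the message from kubectl describe pod -n xxxx):
#   path=$(staging_path_from_event "$message")
#   pv_name_from_staging_path "$path"
```

Extracting the name once keeps the later log filters and mount checks pointed at the same volume.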
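Troubleshooting approach
The storage volume device cannot be mounted at the PV's mount point, so the volume cannot be mounted into the container. Possible causes:
1 warpdrive-related issues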
-
Check the logs of the warpdrive-operator pod on the node hosting the storage volume for errors (filter the logs by the storage volume name)
kubectl get po -n kube-system -o wide | grep warpdrive-operator
kubectl logs -n kube-system warpdrive-operator-xxxx
-
Check the warpdrive-engine logs on the node hosting the storage volume for errors (filter the logs by the storage volume name)
journalctl -xu warpdrive
-
Try restarting warpdrive and the abnormal pod
systemctl restart warpdrive
After restarting the pod, watch it for a while to see whether the volume now mounts normally
-
Try mounting the device manually to see whether it reports an error
For example: mount /dev/mapper/warpdrive-a7fb5848-3a71-462c-813e-9c232e8e96b4 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/warpdrive-a7fb5848-3a71-462c-813e-9c232e8e96b4/globalmount
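The log checks and the mount check in this list can be wrapped in two small helpers, sketched below. Both function names are hypothetical. The sketch assumes the volume name appears verbatim in the log lines and that mountpoint (from util-linux) is available on the node.

```shell
#!/bin/sh
# Hypothetical helper: keep only the log lines that mention the given volume.
# Reads log text on stdin. $1: the storage volume name (matched as a fixed string)
filter_volume_log() {
  grep -F -- "$1"
}

# Hypothetical helper: succeed only if the staging path is currently a mount point.
# $1: the globalmount path from the pod event
is_staged() {
  mountpoint -q "$1"
}

# Example usage (placeholders kept exactly as in the steps above):
#   kubectl logs -n kube-system warpdrive-operator-xxxx | filter_volume_log warpdrive-xxx
#   journalctl -u warpdrive --no-pager | filter_volume_log warpdrive-xxx
#   is_staged /var/lib/kubelet/plugins/kubernetes.io/csi/pv/xxx/globalmount || echo "not mounted"
```

Matching the volume name as a fixed string avoids surprises from characters in the name being read as regular-expression syntax.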
2 Problems with the storage volume device itself
-
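The mount fails with the error "structure needs cleaning"
Try repairing the file system: xfs_repair -L /dev/mapper/warpdrive-a7fb5848-3a71-462c-813e-9c232e8e96b4 (note that -L zeroes the XFS log, so any metadata updates still in the log are lost)
Then try the manual mount again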
mount /dev/mapper/warpdrive-a7fb5848-3a71-462c-813e-9c232e8e96b4 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/warpdrive-a7fb5848-3a71-462c-813e-9c232e8e96b4/globalmount
Then restart the pod
-
Check whether the storage volume named in the event error is healthy
kubectl describe wv warpdrive-xxx
-
On the node hosting the storage volume, check whether the disk backing the storage pool is healthy
dmesg --level err
Look for disk-related errors in the output; if there are any, the problem lies with the disk they refer to
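A minimal sketch of the device checks in this section: it narrows the kernel log to lines that look disk-related and inspects a device read-only before any repair. The function names are hypothetical, and the pattern list is an assumption that will need adjusting for the hardware and file systems in use. xfs_repair -n only reports problems without modifying the device, and the device should be unmounted when it runs.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that look disk- or
# file-system-related. Reads log text on stdin.
# The pattern list is an assumption, not an exhaustive set.
disk_related() {
  grep -Ei 'i/o error|blk_update_request|medium error|xfs|ext4-fs error'
}

# Hypothetical helper: inspect a device without modifying it.
# Prints the file system type and, for XFS, runs xfs_repair in
# no-modify mode (-n).
# $1: the block device, e.g. /dev/mapper/warpdrive-xxx
inspect_device() {
  dev=$1
  fstype=$(blkid -o value -s TYPE "$dev")
  echo "filesystem: $fstype"
  if [ "$fstype" = "xfs" ]; then
    xfs_repair -n "$dev"
  fi
}

# Example usage:
#   dmesg --level err | disk_related
#   inspect_device /dev/mapper/warpdrive-xxx
```

Running the no-modify check first shows what a repair would touch before committing to xfs_repair -L.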