How to migrate stateful pods when a node is NotReady
2024-01-12
wwq2020
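Background
Because the PV is attached to the original node, the pod cannot be scheduled onto another node even after we delete it. (If the PVC's volume.kubernetes.io/selected-node annotation has a value, the scheduler filters nodes by it, so only the node named in that annotation qualifies.)
In this situation we need to rely on the controller manager's automatic detach capability.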
Summary
1 Delete the pod
2 Add a taint so that the PV is detached automatically
···
kubectl taint node ${yournodename} node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
···
3 The pod is automatically scheduled onto another node
Source code
In pkg/controller/volume/attachdetach/reconciler/reconciler.go:
func (rc *reconciler) reconcile(ctx context.Context) {
    ...
    // Has the wait for unmount timed out?
    timeout := elapsedTime > rc.maxWaitForUnmountDuration
    // Is the node healthy?
    isHealthy, err := rc.nodeIsHealthy(attachedVolume.NodeName)
    if err != nil {
        ...
    }
    ...
    // Force a detach only when the node is unhealthy and the wait has timed out
    forceDetach := !isHealthy && timeout
    // Does the node carry the out-of-service taint?
    hasOutOfServiceTaint, err := rc.hasOutOfServiceTaint(attachedVolume.NodeName)
    if err != nil {
        logger.Error(err, "Failed to get taint specs for node", "node", klog.KRef("", string(attachedVolume.NodeName)))
    }
    ...
    // Decide whether to detach: skip a volume that is still mounted,
    // unless the detach is forced or the node has the taint
    if attachedVolume.MountedByNode && !forceDetach && !hasOutOfServiceTaint {
        logger.V(5).Info("Cannot detach volume because it is still mounted", "node", klog.KRef("", string(attachedVolume.NodeName)), "volumeName", attachedVolume.VolumeName)
        continue
    }
    ...
    // Perform the detach
    err = rc.attacherDetacher.DetachVolume(logger, attachedVolume.AttachedVolume, verifySafeToDetach, rc.actualStateOfWorld)
    ...
}
// Check whether the node carries the node.kubernetes.io/out-of-service taint
func (rc *reconciler) hasOutOfServiceTaint(nodeName types.NodeName) (bool, error) {
    node, err := rc.nodeLister.Get(string(nodeName))
    if err != nil {
        return false, err
    }
    return taints.TaintKeyExists(node.Spec.Taints, v1.TaintNodeOutOfService), nil
}
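As the function name suggests, the lookup matches on the taint key alone. A minimal sketch of that kind of check, using a reduced local `Taint` type in place of `v1.Taint` (the type and the helper below are stand-ins written for this example):

```go
package main

import "fmt"

// Taint is a reduced stand-in for v1.Taint with only the fields used here.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

// taintNodeOutOfService is the key of the taint added in the summary above.
const taintNodeOutOfService = "node.kubernetes.io/out-of-service"

// taintKeyExists reports whether any taint in the list has the given key.
// Value and effect are not inspected.
func taintKeyExists(taints []Taint, key string) bool {
	for _, t := range taints {
		if t.Key == key {
			return true
		}
	}
	return false
}

func main() {
	tainted := []Taint{{Key: taintNodeOutOfService, Value: "nodeshutdown", Effect: "NoExecute"}}
	fmt.Println(taintKeyExists(tainted, taintNodeOutOfService)) // true
	fmt.Println(taintKeyExists(nil, taintNodeOutOfService))     // false
}
```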
In pkg/volume/util/operationexecutor/operation_executor.go:
// Detach a volume
func (oe *operationExecutor) DetachVolume(
    logger klog.Logger,
    volumeToDetach AttachedVolume,
    verifySafeToDetach bool,
    actualStateOfWorld ActualStateOfWorldAttacherUpdater) error {
    // Build the operation object
    generatedOperations, err :=
        oe.operationGenerator.GenerateDetachVolumeFunc(logger, volumeToDetach, verifySafeToDetach, actualStateOfWorld)
    if err != nil {
        return err
    }
    // Multi-attach volumes: pass the node name along with the volume name
    if util.IsMultiAttachAllowed(volumeToDetach.VolumeSpec) {
        return oe.pendingOperations.Run(
            volumeToDetach.VolumeName, "" /* podName */, volumeToDetach.NodeName, generatedOperations)
    }
    // Run the operation
    return oe.pendingOperations.Run(
        volumeToDetach.VolumeName, "" /* podName */, "" /* nodeName */, generatedOperations)
}
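The only difference between the two Run calls is whether the node name is passed. One way to picture this is as an operation key that includes the node only for multi-attach volumes. The sketch below is an illustration under that reading: `operationKey` and `detachKey` are hypothetical names, and how pendingOperations actually uses these arguments is not shown in the excerpt.

```go
package main

import "fmt"

// operationKey is a hypothetical stand-in for the arguments passed to Run.
type operationKey struct {
	volumeName string
	nodeName   string
}

// detachKey includes the node name only when the volume allows multi-attach,
// mirroring the two Run calls in the excerpt.
func detachKey(volumeName, nodeName string, multiAttachAllowed bool) operationKey {
	if multiAttachAllowed {
		return operationKey{volumeName: volumeName, nodeName: nodeName}
	}
	return operationKey{volumeName: volumeName}
}

func main() {
	// Single-attach volume: detaches from different nodes share one key.
	fmt.Println(detachKey("pv-1", "node-a", false) == detachKey("pv-1", "node-b", false)) // true

	// Multi-attach volume: each node gets its own key.
	fmt.Println(detachKey("pv-1", "node-a", true) == detachKey("pv-1", "node-b", true)) // false
}
```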
In pkg/volume/util/operationexecutor/operation_generator.go:
func (og *operationGenerator) GenerateDetachVolumeFunc(
    ...
    // Build the detach function
    detachVolumeFunc := func() volumetypes.OperationContext {
        var err error
        if verifySafeToDetach {
            err = og.verifyVolumeIsSafeToDetach(volumeToDetach)
        }
        if err == nil {
            err = volumeDetacher.Detach(volumeName, volumeToDetach.NodeName)
        }
        ...
    }
    ...
    return volumetypes.GeneratedOperations{
        OperationName: DetachOperationName,
        OperationFunc: detachVolumeFunc,
        CompleteFunc: util.OperationCompleteHook(util.GetFullQualifiedPluginNameForVolume(pluginName, volumeToDetach.VolumeSpec), DetachOperationName),
        EventRecorderFunc: nil, // nil because we do not want to generate event on error
    }, nil
}
In pkg/controller/volume/attachdetach/attach_detach_controller.go:
var DefaultTimerConfig = TimerConfig{
    ...
    // Default maximum wait for unmount before a forced detach
    ReconcilerMaxWaitForUnmountDuration: 6 * time.Minute,
    ...
}
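Additional notes
For nodes that need maintenance, proactively delete the pods and taint the node.
For nodes that become NotReady at runtime, deploy an inspection service that deletes the pods and taints the node.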