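Understanding Node Affinity
NodeAffinity is one of the advanced scheduling policies in k8s. Its design goal is to decide which hosts a Pod may be deployed on. Compared with nodeSelector, it supports a richer set of operators (In, NotIn, Exists, DoesNotExist, Gt, Lt), which gives users finer control. It is fairly common in day-to-day environments, for example deploying an application across disaster-recovery zones A and B, or placing an application on nodes that provide GPU resources. Operators often know how to write the YAML but not what the code behind each definition does: how do Gt/Lt compare values? How are multiple match rules combined, and is there a priority among them? Today we answer these questions from the code; once you have seen the logic, you should have a deeper understanding of NodeAffinity.
We start with a few common Pod definitions that use NodeAffinity, to get familiar with how it is configured.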
# Example one (matchExpressions). Goal: deploy an application across multiple zones
---
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity: # the Pod is scheduled onto nodes in az1 or az2
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name # node label key; any custom label can be matched
            operator: In
            values:
            - e2e-az1
            - e2e-az2
# Example two (matchFields). Goal: keep an application off a specific node
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name # node name
            operator: NotIn
            values:
            - work-node-abc
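Kubernetes v1.18 code
The scheduler's main flow is out of scope here; we focus only on the node affinity implementation, whose entry point is PodMatchesNodeSelectorAndAffinityTerms(). It returns whether the node satisfies the Pod's constraints (true/false). Three points are worth noting: 1) if both NodeSelector and NodeAffinity are defined, both must be satisfied; 2) if nodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution is not defined, the function returns true as soon as the NodeSelector check has passed; 3) the actual affinity matching is delegated to v1helper.MatchNodeSelectorTerms().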
pkg/scheduler/framework/plugins/helper/node_affinity.go:28
func PodMatchesNodeSelectorAndAffinityTerms(pod *v1.Pod, node *v1.Node) bool {
// Check whether the Pod defines a node selector ('pod.Spec.NodeSelector')
if len(pod.Spec.NodeSelector) > 0 {
selector := labels.SelectorFromSet(pod.Spec.NodeSelector)
if !selector.Matches(labels.Set(node.Labels)) {
return false
}
}
nodeAffinityMatches := true
affinity := pod.Spec.Affinity
if affinity != nil && affinity.NodeAffinity != nil {
nodeAffinity := affinity.NodeAffinity
if nodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution == nil {
return true
}
// The required node affinity is defined:
// nodeMatchesNodeSelectorTerms() reports whether the node satisfies it
if nodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution != nil {
nodeSelectorTerms := nodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms
nodeAffinityMatches = nodeAffinityMatches && nodeMatchesNodeSelectorTerms(node, nodeSelectorTerms)
}
}
return nodeAffinityMatches
}
// Delegates to v1helper.MatchNodeSelectorTerms()
func nodeMatchesNodeSelectorTerms(node *v1.Node, nodeSelectorTerms []v1.NodeSelectorTerm) bool {
return v1helper.MatchNodeSelectorTerms(nodeSelectorTerms, node.Labels, fields.Set{
"metadata.name": node.Name, // the fields.Set maps "metadata.name" to the node name
})
}
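The control flow above can be sketched without the Kubernetes packages. This is a minimal illustration, not the scheduler's code: podSpec and schedulable are hypothetical names, and the required affinity is reduced to a plain predicate so that only the ordering of the checks is visible.

```go
package main

import "fmt"

// podSpec is a hypothetical, stripped-down stand-in for the
// scheduling-related fields of v1.PodSpec.
type podSpec struct {
	nodeSelector map[string]string
	// required is nil when requiredDuringSchedulingIgnoredDuringExecution
	// is not set; otherwise it reports whether the node satisfies it.
	required func(nodeLabels map[string]string) bool
}

// schedulable mirrors only the order of checks: the nodeSelector must
// match first, then the required node affinity (if any) must match too.
func schedulable(p podSpec, nodeLabels map[string]string) bool {
	for k, want := range p.nodeSelector {
		if got, ok := nodeLabels[k]; !ok || got != want {
			return false
		}
	}
	if p.required == nil {
		return true // nothing required, so every node passes
	}
	return p.required(nodeLabels)
}

func main() {
	node := map[string]string{"disk": "ssd", "zone": "az1"}
	inAZ2 := func(l map[string]string) bool { return l["zone"] == "az2" }
	ssd := map[string]string{"disk": "ssd"}

	fmt.Println(schedulable(podSpec{nodeSelector: ssd}, node))                  // selector only
	fmt.Println(schedulable(podSpec{nodeSelector: ssd, required: inAZ2}, node)) // selector AND affinity
}
```

The second call fails even though the nodeSelector matches, because the two constraints are combined with AND.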
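MatchNodeSelectorTerms() iterates over every nodeSelectorTerms entry and evaluates whichever of MatchExpressions and MatchFields the entry defines:
- MatchExpressions: NodeSelectorRequirementsAsSelector() returns a label selector, whose Matches() is called with the node labels.
- MatchFields: NodeSelectorRequirementsAsFieldSelector() returns a field selector, whose Matches() is called with the node fields.
The code shows that when nodeSelectorTerms contains several entries, the function returns true as soon as one entry matches, so the entries are in an OR relation. Within a single entry, MatchExpressions and MatchFields must both match.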
pkg/apis/core/v1/helper/helpers.go:314
// MatchNodeSelectorTerms checks whether the node labels and fields match node selector terms in ORed;
// nil or empty term matches no objects.
func MatchNodeSelectorTerms(
nodeSelectorTerms []v1.NodeSelectorTerm,
nodeLabels labels.Set,
nodeFields fields.Set,
) bool {
// With multiple nodeSelectorTerms, one matching term is enough to return true
for _, req := range nodeSelectorTerms {
// nil or empty term selects no objects
if len(req.MatchExpressions) == 0 && len(req.MatchFields) == 0 {
continue
}
if len(req.MatchExpressions) != 0 {
labelSelector, err := NodeSelectorRequirementsAsSelector(req.MatchExpressions)
if err != nil || !labelSelector.Matches(nodeLabels) {
continue
}
}
if len(req.MatchFields) != 0 {
fieldSelector, err := NodeSelectorRequirementsAsFieldSelector(req.MatchFields)
if err != nil || !fieldSelector.Matches(nodeFields) {
continue
}
}
return true
}
return false
}
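To make the OR-across-terms and AND-within-a-term behaviour concrete, here is a small self-contained sketch. It is an assumption-laden simplification: the term type and matchTerms function are hypothetical, and each requirement is represented by its already-computed boolean result for one node rather than by a real selector.

```go
package main

import "fmt"

// term is a hypothetical simplification of v1.NodeSelectorTerm. Each slice
// holds the already-evaluated result of every requirement for one node.
type term struct {
	expressions []bool // results of the matchExpressions requirements
	fields      []bool // results of the matchFields requirements
}

// all reports whether every result is true (true for an empty slice).
func all(results []bool) bool {
	for _, ok := range results {
		if !ok {
			return false
		}
	}
	return true
}

// matchTerms follows the shape of MatchNodeSelectorTerms: terms are ORed,
// the two parts of one term are ANDed, and an empty term matches nothing.
func matchTerms(terms []term) bool {
	for _, t := range terms {
		if len(t.expressions) == 0 && len(t.fields) == 0 {
			continue
		}
		if all(t.expressions) && all(t.fields) {
			return true
		}
	}
	return false
}

func main() {
	failing := term{expressions: []bool{true}, fields: []bool{false}}
	passing := term{expressions: []bool{true, true}}

	fmt.Println(matchTerms([]term{failing}))          // one term, fields part fails
	fmt.Println(matchTerms([]term{failing, passing})) // second term rescues the match
	fmt.Println(matchTerms([]term{{}}))               // an empty term selects no node
}
```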
NodeSelectorRequirementsAsSelector() parses the MatchExpressions requirements and returns a labels.Selector, whose Matches() method is then called. The selector is built with labels.NewSelector(), which returns an internalSelector; it is covered in detail below.
pkg/apis/core/v1/helper/helpers.go:234
func NodeSelectorRequirementsAsSelector(nsm []v1.NodeSelectorRequirement) (labels.Selector, error) {
if len(nsm) == 0 {
return labels.Nothing(), nil
}
// Create a selector; the implementation is internalSelector
selector := labels.NewSelector()
for _, expr := range nsm {
var op selection.Operator
switch expr.Operator {
case v1.NodeSelectorOpIn:
op = selection.In
case v1.NodeSelectorOpNotIn:
op = selection.NotIn
case v1.NodeSelectorOpExists:
op = selection.Exists
case v1.NodeSelectorOpDoesNotExist:
op = selection.DoesNotExist
case v1.NodeSelectorOpGt:
op = selection.GreaterThan
case v1.NodeSelectorOpLt:
op = selection.LessThan
default:
return nil, fmt.Errorf("%q is not a valid node selector operator", expr.Operator)
}
r, err := labels.NewRequirement(expr.Key, op, expr.Values)
if err != nil {
return nil, err
}
// Add the Requirement to the selector; with several expressions, one Requirement is added per expression
selector = selector.Add(*r)
}
return selector, nil
}
Before that, consider another example: a matchExpressions block with several requirements. How does the code combine the two requirements?
# Example three: matchExpressions with multiple requirements
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/e2e-az-name # req 1
          operator: In
          values:
          - e2e-az1
          - e2e-az2
        - key: cpu-type # req 2
          operator: In
          values:
          - gpu
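labels.NewSelector() creates the label selector, an internalSelector. Its definition shows that it is simply a []Requirement: Add() copies the slice, appends the new requirement and sorts the result by key, and Matches() iterates over the slice and calls Matches() on each requirement. When there are several requirements, all of them must be satisfied for Matches() to return true, which is an AND relation, as in the multi-requirement configuration of Example three above.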
// NewSelector returns a nil selector
func NewSelector() Selector {
return internalSelector(nil)
}
type internalSelector []Requirement
// Add adds requirements to the selector. It copies the current selector returning a new one
func (lsel internalSelector) Add(reqs ...Requirement) Selector {
var sel internalSelector
for ix := range lsel {
sel = append(sel, lsel[ix])
}
for _, r := range reqs {
sel = append(sel, r)
}
sort.Sort(ByKey(sel))
return sel
}
func (lsel internalSelector) Matches(l Labels) bool {
for ix := range lsel {
if matches := lsel[ix].Matches(l); !matches {
return false
}
}
return true
}
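Applied to Example three, the AND between requirements can be checked with a short sketch. The types below are hypothetical stand-ins for labels.Requirement and internalSelector, restricted to the In operator; the node label sets are made up for illustration.

```go
package main

import "fmt"

// inRequirement is a hypothetical stand-in for a labels.Requirement that
// uses the In operator only.
type inRequirement struct {
	key    string
	values []string
}

// matches reports whether the label exists and equals one of the values.
func (r inRequirement) matches(labels map[string]string) bool {
	got, ok := labels[r.key]
	if !ok {
		return false
	}
	for _, want := range r.values {
		if got == want {
			return true
		}
	}
	return false
}

// selector plays the role of internalSelector: every requirement must match.
type selector []inRequirement

func (s selector) matches(labels map[string]string) bool {
	for _, r := range s {
		if !r.matches(labels) {
			return false
		}
	}
	return true
}

func main() {
	exampleThree := selector{
		{key: "kubernetes.io/e2e-az-name", values: []string{"e2e-az1", "e2e-az2"}},
		{key: "cpu-type", values: []string{"gpu"}},
	}
	gpuInAZ1 := map[string]string{"kubernetes.io/e2e-az-name": "e2e-az1", "cpu-type": "gpu"}
	plainInAZ1 := map[string]string{"kubernetes.io/e2e-az-name": "e2e-az1"}

	fmt.Println(exampleThree.matches(gpuInAZ1))   // both requirements hold
	fmt.Println(exampleThree.matches(plainInAZ1)) // cpu-type label is missing
}
```

A node in the right zone but without the cpu-type label is rejected, because both requirements have to hold.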
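Now let's look at how Requirement.Matches() evaluates each operator:
In/NotIn
Checks whether the label key exists, then whether the label value appears in the values list; both are exact string comparisons. If the key is missing, In returns false and NotIn returns true.
Exists/DoesNotExist
Only checks whether the label key exists; the label value is not examined.
GreaterThan/LessThan
Parses the label value and the single requirement value as base-10 64-bit integers and compares them numerically. A missing key, or a value that cannot be parsed as an integer, does not match.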
staging/src/k8s.io/apimachinery/pkg/labels/selector.go:198
func (r *Requirement) Matches(ls Labels) bool {
switch r.operator {
// In, =, ==
case selection.In, selection.Equals, selection.DoubleEquals:
if !ls.Has(r.key) {
return false
}
return r.hasValue(ls.Get(r.key)) // whether the values list contains this label value; exact string match
// NotIn, !=
case selection.NotIn, selection.NotEquals:
if !ls.Has(r.key) {
return true
}
return !r.hasValue(ls.Get(r.key))
// Exists
case selection.Exists: // whether the label key exists
return ls.Has(r.key)
// DoesNotExist
case selection.DoesNotExist:
return !ls.Has(r.key)
// Gt, Lt
case selection.GreaterThan, selection.LessThan: // both sides are parsed as integers and compared numerically
if !ls.Has(r.key) {
return false
}
lsValue, err := strconv.ParseInt(ls.Get(r.key), 10, 64)
if err != nil {
klog.V(10).Infof("ParseInt failed for value %+v in label %+v, %+v", ls.Get(r.key), ls, err)
return false
}
if len(r.strValues) != 1 {
klog.V(10).Infof("Invalid values count %+v of requirement %#v, for 'Gt', 'Lt' operators, exactly one value is required", len(r.strValues), r)
return false
}
var rValue int64
for i := range r.strValues {
// parse the string as an integer before comparing
rValue, err = strconv.ParseInt(r.strValues[i], 10, 64)
if err != nil {
klog.V(10).Infof("ParseInt failed for value %+v in requirement %#v, for 'Gt', 'Lt' operators, the value must be an integer", r.strValues[i], r)
return false
}
}
return (r.operator == selection.GreaterThan && lsValue > rValue) || (r.operator == selection.LessThan && lsValue < rValue)
default:
return false
}
}
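The Gt/Lt branch is the one most likely to surprise, because label values are strings. The sketch below isolates that branch using the same strconv.ParseInt call; compareLabel is a hypothetical helper name, and the "Gt"/"Lt" strings stand in for the selection operators.

```go
package main

import (
	"fmt"
	"strconv"
)

// compareLabel is a hypothetical helper that follows the Gt/Lt branch:
// both sides are parsed as base-10 int64 values, and any parse failure
// means the requirement does not match.
func compareLabel(op, labelValue, want string) bool {
	lv, err := strconv.ParseInt(labelValue, 10, 64)
	if err != nil {
		return false
	}
	rv, err := strconv.ParseInt(want, 10, 64)
	if err != nil {
		return false
	}
	switch op {
	case "Gt":
		return lv > rv
	case "Lt":
		return lv < rv
	}
	return false
}

func main() {
	fmt.Println(compareLabel("Gt", "10", "9"))   // numeric, not lexicographic
	fmt.Println(compareLabel("Gt", "8", "8"))    // strictly greater is required
	fmt.Println(compareLabel("Lt", "4.5", "8"))  // not an integer
	fmt.Println(compareLabel("Gt", "16Gi", "8")) // unit suffixes do not parse
}
```

In practice this means a label used with Gt or Lt has to hold a plain integer string; a value such as "16Gi" or "4.5" never matches, and the failure is only visible in high-verbosity scheduler logs.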
NodeSelectorRequirementsAsFieldSelector() parses the matchFields requirements and returns a fields.Selector.
pkg/apis/core/v1/helper/helpers.go:268
func NodeSelectorRequirementsAsFieldSelector(nsm []v1.NodeSelectorRequirement) (fields.Selector, error) {
if len(nsm) == 0 {
return fields.Nothing(), nil
}
selectors := []fields.Selector{}
for _, expr := range nsm {
switch expr.Operator {
case v1.NodeSelectorOpIn:
if len(expr.Values) != 1 { // values must contain exactly one entry
return nil, fmt.Errorf("unexpected number of value (%d) for node field selector operator %q",
len(expr.Values), expr.Operator)
}
// selector: hasTerm
selectors = append(selectors, fields.OneTermEqualSelector(expr.Key, expr.Values[0]))
case v1.NodeSelectorOpNotIn:
if len(expr.Values) != 1 {
return nil, fmt.Errorf("unexpected number of value (%d) for node field selector operator %q",
len(expr.Values), expr.Operator)
}
// selector: notHasTerm
selectors = append(selectors, fields.OneTermNotEqualSelector(expr.Key, expr.Values[0]))
default:
return nil, fmt.Errorf("%q is not a valid node field selector operator", expr.Operator)
}
}
// multiple matchFields requirements are ANDed
return fields.AndSelectors(selectors...), nil
}
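The field selector's Matches() compares the key and value of each matchFields expression against the fields.Set{"metadata.name": node.Name} passed in by the caller.
From the code, the only supported field is "metadata.name", so the comparison is always against the value of node.Name.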
//v1helper.MatchNodeSelectorTerms(nodeSelectorTerms, node.Labels, fields.Set{
// "metadata.name": node.Name,
//})
func (t *hasTerm) Matches(ls Fields) bool {
return ls.Get(t.field) == t.value
}
func (t *notHasTerm) Matches(ls Fields) bool {
return ls.Get(t.field) != t.value
}
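Putting the matchFields rules together (only In and NotIn, exactly one value, requirements ANDed, comparison against the node name), a minimal sketch might look as follows. fieldReq and matchFields are hypothetical names, and the validation is done in a separate pass here to imitate the way the real function builds every selector before anything is matched.

```go
package main

import "fmt"

// fieldReq is a hypothetical stand-in for a matchFields entry.
type fieldReq struct {
	key      string
	operator string // "In" or "NotIn"
	values   []string
}

// matchFields validates every requirement first, then ANDs the results
// against the single field set {"metadata.name": nodeName}.
func matchFields(reqs []fieldReq, nodeName string) (bool, error) {
	if len(reqs) == 0 {
		return false, nil // no requirements select nothing
	}
	for _, r := range reqs {
		if r.operator != "In" && r.operator != "NotIn" {
			return false, fmt.Errorf("%q is not a valid node field selector operator", r.operator)
		}
		if len(r.values) != 1 {
			return false, fmt.Errorf("operator %q needs exactly one value, got %d", r.operator, len(r.values))
		}
	}
	fields := map[string]string{"metadata.name": nodeName}
	for _, r := range reqs {
		equal := fields[r.key] == r.values[0]
		if equal != (r.operator == "In") {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	exclude := []fieldReq{{key: "metadata.name", operator: "NotIn", values: []string{"work-node-abc"}}}

	fmt.Println(matchFields(exclude, "work-node-abc")) // the excluded node
	fmt.Println(matchFields(exclude, "work-node-xyz")) // any other node
}
```

Because an In requirement accepts only one value, pinning a Pod to one of several named nodes needs several nodeSelectorTerms entries (which are ORed) rather than one matchFields entry with a longer values list.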
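Having read through the node affinity code, we can now answer the questions raised at the beginning.
- When a single nodeSelectorTerms entry contains both matchExpressions and matchFields, how are they combined? Answer: AND. Both keys must belong to the same entry; separate entries would be ORed.
    - matchExpressions:
      - key: kubernetes.io/e2e-az-name
        operator: In
        values:
        - e2e-az1
        - e2e-az2
      - key: cpu-type
        operator: In
        values:
        - gpu
      matchFields:
      - key: metadata.name
        operator: NotIn
        values:
        - work-node-abc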
- When matchExpressions or matchFields contains several match requirements, how are they combined? Answer: AND.
    - matchExpressions:
      - key: kubernetes.io/e2e-az-name
        operator: In
        values:
        - e2e-az1
        - e2e-az2
      - key: cpu-type
        operator: In
        values:
        - gpu
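- When a single match requirement lists several values, how are they combined? Answer: OR (for the In operator, the label value only has to equal one of them).
    - matchExpressions:
      - key: kubernetes.io/e2e-az-name
        operator: In
        values:
        - e2e-az1
        - e2e-az2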
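- What do the operators In, NotIn, Exists, DoesNotExist, Gt and Lt do?
In/NotIn
Checks whether the label key exists, then whether the label value appears in the values list; both are exact string comparisons. If the key is missing, In returns false and NotIn returns true.
Exists/DoesNotExist
Only checks whether the label key exists; the label value is not examined.
GreaterThan/LessThan
Parses the label value and the single requirement value as base-10 64-bit integers and compares them numerically. A missing key, or a value that cannot be parsed as an integer, does not match.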
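As a closing illustration, the four non-numeric operators can be compared side by side, with particular attention to a node that lacks the label. This is a sketch under the same simplifications as before: matchesOp and contains are hypothetical helpers, and the operator names are plain strings rather than the selection constants.

```go
package main

import "fmt"

// contains reports whether v is one of values (exact string comparison).
func contains(values []string, v string) bool {
	for _, candidate := range values {
		if candidate == v {
			return true
		}
	}
	return false
}

// matchesOp is a hypothetical condensed view of the In, NotIn, Exists and
// DoesNotExist branches of Requirement.Matches().
func matchesOp(op, key string, values []string, labels map[string]string) bool {
	got, has := labels[key]
	switch op {
	case "In":
		return has && contains(values, got)
	case "NotIn":
		return !has || !contains(values, got)
	case "Exists":
		return has
	case "DoesNotExist":
		return !has
	}
	return false
}

func main() {
	unlabeled := map[string]string{}
	for _, op := range []string{"In", "NotIn", "Exists", "DoesNotExist"} {
		// Evaluate each operator against a node that has no cpu-type label.
		fmt.Println(op, matchesOp(op, "cpu-type", []string{"gpu"}, unlabeled))
	}
}
```

The point to notice is that NotIn also matches nodes that do not carry the label at all, so it cannot by itself guarantee that the label is present.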