K8S-[1] Pod Scheduling
Main Pod attributes that affect scheduling
1. Scheduling Basis
resources -- resource requests and limits
resources has two parts:
- Resource limits (the maximum amount of node resources the container may use; enforced at runtime)
  resources.limits.cpu
  resources.limits.memory
- Resource requests (the scheduling basis: the minimum resources the target node must have available)
  resources.requests.cpu
  resources.requests.memory
Configuration example (placed one level below containers, under spec in the Pod/controller manifest):
resources:
  requests:          # scheduling basis: the node must have at least 128Mi of memory and 250m (0.25 core) of CPU available
    memory: "128Mi"
    cpu: "250m"
  limits:            # resource limit: the container may use at most 256Mi of memory and 500m of CPU, and cannot exceed these values
    memory: "256Mi"
    cpu: "500m"
Complete pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app_name: resource-pod
  name: pod-resource-test
  namespace: default
spec:
  containers:
  - image: nginx:1.18
    name: web
    volumeMounts:
    - name: log
      mountPath: /data
    ports:
    - name: web
      containerPort: 80
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"
  volumes:
  - name: log
    emptyDir: {}
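Assuming the manifest above is saved as pod-resource.yaml (the filename is my assumption), it can be created with:
kubectl apply -f pod-resource.yaml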
Check the result:
Run the Pod and take a look:
[root@k8s-master pod_yml]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-resource-test 1/1 Running 0 5m42s 10.244.169.159 k8s-node2 <none> <none>
[root@k8s-master pod_yml]# kubectl describe node k8s-node2
Non-terminated Pods: (8 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
default pod-resource-test 250m (6%) 500m (12%) 128Mi (3%) 256Mi (6%) 6m34s
kube-system calico-node-t2qh4 250m (6%) 0 (0%) 0 (0%) 0 (0%) 6d3h
kube-system kube-proxy-hqvkk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 6d3h
kubernetes-dashboard kubernetes-dashboard-5dbf55bd9d-hblj7 0 (0%) 0 (0%) 0 (0%) 0 (0%) 6d2h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 500m (12%) 500m (12%)
memory 128Mi (3%) 256Mi (6%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
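A related detail not shown above: because requests and limits are both set but not equal, this Pod falls into the Burstable QoS class (Guaranteed would require requests to equal limits for every container). A quick check, assuming the Pod name used above:
kubectl get pod pod-resource-test -o jsonpath='{.status.qosClass}'
# Burstable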
2. Scheduling Policies
nodeSelector (mandatory) & nodeAffinity (affinity)
nodeSelector:
Schedules the Pod onto a Node whose labels match; if no Node has a matching label, scheduling fails.
Purpose:
• Constrain a Pod to run on specific nodes
• Exact match against node labels
Use cases:
• Dedicated nodes: group Nodes by business line
• Special hardware: some Nodes have SSD disks or GPUs
Characteristics:
Mandatory and explicit: the Pod goes to whichever node carries the specified label; if no node has it, scheduling fails and the Pod stays in Pending.
Configuration example:
Schedule the Pod onto a node that has an SSD disk.
1. First label the node; here node1 is the one with the SSD
Add the label:
kubectl label nodes k8s-node1 disk=SSD
Check:
kubectl get nodes --show-labels -l disk=SSD
2. Add the nodeSelector field to the Pod spec
Add:
nodeSelector:
  disk: SSD
apiVersion: v1
kind: Pod
metadata:
  labels:
    app_name: selector-pod
  name: pod-selector-test
  namespace: default
spec:
  nodeSelector:
    disk: SSD
  containers:
  - image: nginx:1.18
    name: web
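Assuming the manifest is saved as pod-selector.yaml (filename assumed), create it and confirm the selector was picked up:
kubectl apply -f pod-selector.yaml
kubectl describe pod pod-selector-test | grep -i node-selector
# Node-Selectors:  disk=SSD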
3. Verify:
[root@k8s-master pod_yml]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-selector-test 1/1 Running 0 4m24s 10.244.36.86 k8s-node1 <none> <none>
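For completeness (these commands are not part of the original walkthrough), the node label can later be changed or removed with kubectl label; removing it does not evict the Pod that is already running, because nodeSelector is only evaluated at scheduling time:
kubectl label nodes k8s-node1 disk=HDD --overwrite   # change the value
kubectl label nodes k8s-node1 disk-                  # remove the label (trailing "-")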
nodeAffinity (affinity)
Node affinity serves the same purpose as nodeSelector, but it is more flexible and can express more kinds of conditions.
Scheduling rules come in a soft and a hard variant instead of only a hard requirement:
• Hard (required): must be satisfied
• Soft (preferred): the scheduler tries to satisfy it, but does not guarantee it
Operators: In (most common), NotIn, Exists, DoesNotExist, Gt, Lt
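A small sketch of the less common operators (the label keys below are hypothetical, chosen only for illustration): Exists/DoesNotExist only test whether the key is present, so no values list is given, while Gt/Lt compare the label value as an integer.
- matchExpressions:
  - key: disk            # hypothetical label; matches any node that has a "disk" label, whatever its value
    operator: Exists
  - key: cpu-cores       # hypothetical label whose value is an integer string, e.g. "8"
    operator: Gt
    values:
    - "4"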
Configuration: the policy is configured under the spec field.
Hard policy: required
affinity:
  nodeAffinity:
    # hard policy
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: <labelname>
          operator: In
          values:
          - <labelvalue>
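Worth noting (my addition, not part of the original template): multiple entries under nodeSelectorTerms are ORed together, while the matchExpressions inside a single term are ANDed. A sketch with hypothetical labels:
requiredDuringSchedulingIgnoredDuringExecution:
  nodeSelectorTerms:
  - matchExpressions:        # term 1: gpu=nvidia-tesla AND group=ai
    - key: gpu
      operator: In
      values:
      - nvidia-tesla
    - key: group
      operator: In
      values:
      - ai
  - matchExpressions:        # term 2, ORed with term 1: disk=SSD
    - key: disk
      operator: In
      values:
      - SSD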
Soft policy: preferred
affinity:
  nodeAffinity:
    # soft policy
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: <labelname>
          operator: <operator>   # In, NotIn, ...
          values:
          - <labelvalue>
Configuration example:
Hard policy (required): the Pod may only be assigned to a node that carries the label gpu=nvidia-tesla; otherwise it is not scheduled at all.
Soft policy (preferred): the scheduler tries to place the Pod on a node labeled group=ai; if none matches, it falls back to the normal scheduling algorithm and places the Pod on some other node.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu
            operator: In
            values:
            - nvidia-tesla
      # <soft policy>
      # preferredDuringSchedulingIgnoredDuringExecution:
      # - weight: 1
      #   preference:
      #     matchExpressions:
      #     - key: group
      #       operator: In
      #       values:
      #       - ai
  containers:
  - image: nginx:1.18
    name: web
    ports:
    - name: web
      containerPort: 80
Right now only the hard policy is configured. The effect is the same as nodeSelector: it is mandatory, and with no matching node the Pod is not scheduled.
First confirm that no node carries this label:
kubectl get node --show-labels -l gpu=nvidia-tesla
#No resources found
Create the Pod, then check its status. Because no Node matches the label, the Pod cannot be scheduled and stays in Pending, which demonstrates that required is mandatory.
kubectl get pod
#NAME READY STATUS RESTARTS AGE
#pod-affinity-test 0/1 Pending 0 5s
kubectl describe pods pod-affinity-test
#Events:
# Type Reason Age From Message
# ---- ------ ---- ---- -------
# Warning FailedScheduling 85s 0/3 nodes are available: 3 node(s) didn't match node selector.
# Warning FailedScheduling 85s 0/3 nodes are available: 3 node(s) didn't match node selector.
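If we actually wanted this Pod to run (this step is my addition, not part of the original walkthrough), labeling any node so that the required rule is satisfied is enough; the scheduler keeps retrying pending Pods and will place it once a node matches:
kubectl label nodes k8s-node1 gpu=nvidia-tesla
kubectl get pod pod-affinity-test -o wide   # should move from Pending to Running on k8s-node1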
Now look at the soft policy: uncomment the soft-policy block from before and delete the hard policy.
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: group
            operator: In
            values:
            - ai
Check whether any node in the cluster currently has the label group=ai:
kubectl get nodes --show-labels -l group=ai
#No resources found
Still none, yet after creating the Pod it does get scheduled, which demonstrates that the preferred soft policy is only an affinity preference and tolerates the missing label.
[root@k8s-master pod_yml]# kubectl get pod -o wide
#NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
#pod-affinity-test 1/1 Running 0 17s 10.244.169.168 k8s-node2 <none> <none>
kubectl describe pods pod-affinity-test
#Events:
# Type Reason Age From Message
# ---- ------ ---- ---- -------
# Normal Scheduled 88s Successfully assigned default/pod-affinity-test to k8s-node2
# Normal Pulled 88s kubelet, k8s-node2 Container image "nginx:1.18" already present on machine
# Normal Created 88s kubelet, k8s-node2 Created container web
# Normal Started 88s kubelet, k8s-node2 Started container web
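A closing note not covered by the example above: preferredDuringSchedulingIgnoredDuringExecution may contain several weighted terms; weight ranges from 1 to 100, the scheduler sums the weights of all terms a node satisfies, and nodes with a higher total are preferred. A minimal sketch, assuming the hypothetical labels group=ai and disk=SSD:
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 80                 # stronger preference: nodes labeled group=ai
      preference:
        matchExpressions:
        - key: group
          operator: In
          values:
          - ai
    - weight: 20                 # weaker preference: nodes labeled disk=SSD
      preference:
        matchExpressions:
        - key: disk
          operator: In
          values:
          - SSD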