k8s cluster - CNI network plugins (Calico and Flannel)
2022-09-04
Chris0Yang
1) Deploy the Flannel network (on the master node)
On the master node, the worker nodes show a status of NotReady:
[root@k8s-master01-15 ~]# kubectl get node
NAME              STATUS     ROLES    AGE   VERSION
k8s-master01-15   NotReady   master   20m   v1.20.11
k8s-node01-16     NotReady   <none>   19m   v1.20.11
k8s-node02-17     NotReady   <none>   19m   v1.20.11
How to diagnose
Run kubectl get nodes; the nodes show as NotReady.
Inspect the kubelet logs with journalctl -u kubelet:
Unable to update cni config: no networks found in /etc/cni/net.d
This error means no CNI network plugin has been installed yet, so continue below to deploy the Flannel network. (Occasionally the status turns Ready on its own after a short wait.)
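As a quick sanity check (a minimal sketch; /etc/cni/net.d is the default kubeadm CNI config path), you can confirm the directory is still empty before installing a plugin:
# The kubelet error above means no CNI config exists yet on this node.
# After Flannel is deployed, a 10-flannel.conflist file should appear here.
ls -l /etc/cni/net.d/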
# Run on the master machine
# 1. Create the folders used to organize the installation files
mkdir -p /data/script/kubernetes/install-k8s/core/ && cd /data/script/kubernetes/
# 2. Move the main files into the folder
mv /data/script/kubeadm-init.log /data/script/kubeadm-config.yaml /data/script/kubernetes/install-k8s/core/
# 3. Create the flannel folder
cd /data/script/kubernetes/install-k8s/ && mkdir -p /data/script/kubernetes/install-k8s/plugin/flannel/ && cd /data/script/kubernetes/install-k8s/plugin/flannel/
Download the kube-flannel.yml file
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Output of the download command
--2021-07-01 18:10:44-- https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.108.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14366 (14K) [text/plain]
Saving to: ‘kube-flannel.yml’
kube-flannel.yml 100%[================================================>] 14.03K --.-KB/s in 0.05s
2021-07-01 18:15:00 (286 KB/s) - ‘kube-flannel.yml’ saved [14366/14366]
Install the Flannel network plugin
# Pull the image first; this can be slow from inside mainland China
docker pull quay.io/coreos/flannel:v0.14.0
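If the pull from quay.io is too slow, a common workaround (a sketch, assuming you have some machine that can reach quay.io) is to export the image there and load it on each cluster node:
# On a machine that can pull from quay.io: save the image to a tarball
docker save quay.io/coreos/flannel:v0.14.0 -o flannel-v0.14.0.tar
# Copy the tarball to every cluster node, then load it into the local Docker
docker load -i flannel-v0.14.0.tar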
Edit the NIC configuration in kube-flannel.yml (on the master machine)
vim kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.14.0
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.14.0
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=eth0   # if the machine has multiple NICs, name the internal one here; by default flannel picks the first NIC
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
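Before applying the manifest, it can be worth confirming that the NIC name given to --iface actually exists on every node (an optional check; the awk filter is just one way to list interfaces):
# List IPv4 interfaces and their addresses; pick the internal NIC
ip -o -4 addr show | awk '{print $2, $4}'
# e.g. "eth0 172.23.199.15/24" -> use --iface=eth0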
# Create the flannel resources
kubectl create -f kube-flannel.yml
# Output of the create command
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
# Check the pods: the flannel component is now running. By default, system components are installed in the kube-system namespace
[root@k8s-master01-15 flannel]# kubectl get pod -n kube-system
NAME                                      READY   STATUS             RESTARTS         AGE
coredns-66bff467f8-tlqdw                  1/1     Running            0                18m
coredns-66bff467f8-zpg4q                  1/1     Running            0                18m
etcd-k8s-master01-15                      1/1     Running            0                18m
kube-apiserver-k8s-master01-15            1/1     Running            0                18m
kube-controller-manager-k8s-master01-15   1/1     Running            0                18m
kube-flannel-ds-6lbmw                     1/1     Running            15 (5m30s ago)   59m
kube-flannel-ds-97mkh                     0/1     CrashLoopBackOff   14 (4m58s ago)   59m
kube-flannel-ds-fthvm                     0/1     Running            15 (5m26s ago)   59m
kube-proxy-4jj7b                          0/1     CrashLoopBackOff   0                4m9s
kube-proxy-ksltf                          0/1     CrashLoopBackOff   0                4m9s
kube-proxy-w8dcr                          0/1     CrashLoopBackOff   0                4m9s
kube-scheduler-k8s-master01-15            1/1     Running            0                18m
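If a flannel pod sticks in CrashLoopBackOff (as kube-flannel-ds-97mkh does above), its logs usually name the cause, e.g. a wrong --iface value or a Pod CIDR that does not match the --pod-network-cidr passed to kubeadm init. A generic debugging sketch, using a pod name from the listing above:
kubectl -n kube-system logs kube-flannel-ds-97mkh --previous
kubectl -n kube-system describe pod kube-flannel-ds-97mkh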
# Check the nodes again; the status has changed to Ready
[root@k8s-master01-15 flannel]# kubectl get node
NAME              STATUS   ROLES    AGE   VERSION
k8s-master01-15   Ready    master   19m   v1.20.11
To uninstall Flannel, delete the resources defined in kube-flannel.yml:
[root@k8s-master01-15 flannel]# kubectl delete -f kube-flannel.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy "psp.flannel.unprivileged" deleted
clusterrole.rbac.authorization.k8s.io "flannel" deleted
clusterrolebinding.rbac.authorization.k8s.io "flannel" deleted
serviceaccount "flannel" deleted
configmap "kube-flannel-cfg" deleted
daemonset.apps "kube-flannel-ds" deleted
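Note that kubectl delete does not remove the files flannel wrote on each node. Before switching to another CNI (such as Calico below), the usual per-node cleanup looks like this (a sketch, assuming the default file and interface names):
# Remove the CNI config installed by the init container
rm -f /etc/cni/net.d/10-flannel.conflist
# Remove the vxlan device and the CNI bridge if they still exist
ip link delete flannel.1
ip link delete cni0
systemctl restart kubelet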
2) Deploy the Calico network plugin
For problems you may run into with Calico and how to handle them, see:
https://www.jianshu.com/p/8b4c3ac2db6f
Download the manifest from the Calico website:
wget https://docs.projectcalico.org/manifests/calico-etcd.yaml
Only the modified parts are listed below:
$ vim calico-etcd.yaml
...
# Each etcd-* value below must be filled in with the output of the indicated command
# kubeadm cert paths:        /opt/etcd/ssl/server-key.pem, server.pem, ca.pem
# binary-install cert paths: /opt/k8s_tls/etcd/server-key.pem, server.pem, ca.pem
  # etcd private key
  etcd-key: # output of: cat /opt/k8s_tls/etcd/server-key.pem | base64 -w 0
  # etcd certificate
  etcd-cert: # output of: cat /opt/k8s_tls/etcd/server.pem | base64 -w 0
  # etcd CA certificate
  etcd-ca: # output of: cat /opt/k8s_tls/etcd/ca.pem | base64 -w 0
...
  # An external etcd cluster is required; deployment notes: https://www.jianshu.com/p/fbec19c20454
  # etcd cluster endpoints
  etcd_endpoints: "https://172.23.199.15:2379,https://172.23.199.16:2379,https://172.23.199.17:2379"
  etcd_ca: "/calico-secrets/etcd-ca"
  etcd_cert: "/calico-secrets/etcd-cert"
  etcd_key: "/calico-secrets/etcd-key"
...
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # --- added: pin IP autodetection to the internal NIC ---
            - name: IP_AUTODETECTION_METHOD
              value: "interface=eth0"
              # value: "interface=eth.*"
              # value: "can-reach=www.baidu.com"
            # --- end of addition ---
            - name: IP
              value: "autodetect"
            # Disable IPIP mode
            - name: CALICO_IPV4POOL_IPIP
              value: "Never"
            # Pod IP range; this value should match the pod_net variable configured earlier in hosts.yaml
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"
...
        # Host directory for the CNI plugin binaries; /opt/apps would match the install_dir variable in hosts.yaml
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin          # /opt/apps/cni/bin
        # CNI config directory, manually specified
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d        # /opt/apps/cni/conf
        # CNI log directory, manually specified
        - name: cni-log-dir
          hostPath:
            path: /var/log/calico/cni   # /opt/apps/cni/log
        # Change this volume's mount mode to 0440 (it appears twice in the file)
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0440
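To avoid hand-pasting the three base64 blobs into the Secret, they can also be injected with sed (a sketch; it assumes the placeholder lines in the upstream manifest still read "# etcd-key: null" etc., and uses the binary-install cert paths from above):
ETCD_KEY=$(base64 -w 0 < /opt/k8s_tls/etcd/server-key.pem)
ETCD_CERT=$(base64 -w 0 < /opt/k8s_tls/etcd/server.pem)
ETCD_CA=$(base64 -w 0 < /opt/k8s_tls/etcd/ca.pem)
sed -i "s|# etcd-key: null|etcd-key: ${ETCD_KEY}|" calico-etcd.yaml
sed -i "s|# etcd-cert: null|etcd-cert: ${ETCD_CERT}|" calico-etcd.yaml
sed -i "s|# etcd-ca: null|etcd-ca: ${ETCD_CA}|" calico-etcd.yaml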
Apply the modified manifest:
kubectl apply -f calico-etcd.yaml
secret/calico-etcd-secrets created
configmap/calico-config created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
poddisruptionbudget.policy/calico-kube-controllers created
It starts the following Pods in the kube-system namespace:
[root@k8s-master kubeadm]# kubectl get pod -n kube-system
NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-9d65bcc55-k2d8j   1/1     Running   0          18h
calico-node-7jbk2                         1/1     Running   0          18h
calico-node-ffbwh                         1/1     Running   0          18h
calico-node-rl4dw                         1/1     Running   0          18h
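As an optional smoke test of pod-to-pod networking (the pod names here are arbitrary), start two busybox pods and ping one from the other:
kubectl run net-test-1 --image=busybox:1.28 --restart=Never -- sleep 3600
kubectl run net-test-2 --image=busybox:1.28 --restart=Never -- sleep 3600
kubectl get pod -o wide     # note the IP assigned to net-test-2
kubectl exec net-test-1 -- ping -c 3 <IP-of-net-test-2>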
Deploying and using the calicoctl tool
1) Environment
Kubernetes cluster v1.20.11, Calico v3.20.0
2) Download the calicoctl binary
wget https://github.com/projectcalico/calicoctl/releases/download/v3.20.0/calicoctl
cp calicoctl /usr/bin
chmod +x /usr/bin/calicoctl
3) Test from the command line
[root@k8s-master-15 ~]# DATASTORE_TYPE=kubernetes KUBECONFIG=~/.kube/config calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+------------+-------------+
| PEER ADDRESS  |     PEER TYPE     | STATE |   SINCE    |    INFO     |
+---------------+-------------------+-------+------------+-------------+
| 172.23.199.16 | node-to-node mesh | up    | 20xx-xx-xx | Established |
| 172.23.199.17 | node-to-node mesh | up    | 20xx-xx-xx | Established |
+---------------+-------------------+-------+------------+-------------+
4) Test with a configuration file
# 1. Create the configuration file
[root@k8s-master-15 ~]# mkdir -p /etc/calico/
[root@k8s-master-15 ~]# vim /etc/calico/calicoctl.cfg
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "kubernetes"
  kubeconfig: "~/.kube/config"
# 2. Then test as follows
[root@k8s-master-15 ~]# calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+------------+-------------+
| PEER ADDRESS  |     PEER TYPE     | STATE |   SINCE    |    INFO     |
+---------------+-------------------+-------+------------+-------------+
| 172.23.199.16 | node-to-node mesh | up    | 20xx-xx-xx | Established |
| 172.23.199.17 | node-to-node mesh | up    | 20xx-xx-xx | Established |
+---------------+-------------------+-------+------------+-------------+
Note: node-to-node mesh means every node peers with every other node over full-mesh BGP sessions.
[root@k8s-master ~]# netstat -anp | grep ESTABLISH | grep bird
tcp        0      0 172.23.199.16:179      172.23.199.16:46090    ESTABLISHED 8918/bird
tcp        0      0 172.23.199.17:179      172.23.199.17:49770    ESTABLISHED 8918/bird
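With /etc/calico/calicoctl.cfg in place, a couple of other read-only checks are handy (a sketch; both are standard calicoctl subcommands):
calicoctl get nodes
calicoctl get ippool -o wide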
Since my cluster is not very large, I deployed Calico in BGP mode with node-to-node mesh, where all nodes interconnect in a full mesh. This mode is fine for small clusters; as of Calico 3.4.0 it can support on the order of 100+ nodes.
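For larger clusters, the usual approach is to disable the full mesh and peer through route reflectors instead. A minimal sketch of just the mesh-off step, applied with calicoctl apply -f (the AS number here is an example value; the route-reflector setup itself is out of scope for this article):
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  # Turn off full-mesh peering; nodes then only peer with explicitly configured BGPPeers
  nodeToNodeMeshEnabled: false
  asNumber: 64512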