MetalLB Debugging Analysis

2020-05-30  huiwq1990

0 Preface


1 Environment Information

Cluster nodes

[root@master ~]# kubectl  get node -o wide
NAME     STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION           CONTAINER-RUNTIME
master   Ready    master   19h   v1.17.5   192.168.26.10   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.9
node1    Ready    <none>   18h   v1.17.5   192.168.26.11   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.9
node2    Ready    <none>   18h   v1.17.5   192.168.26.12   <none>        CentOS Linux 7 (Core)   3.10.0-1127.el7.x86_64   docker://19.3.9
[root@master ~]#

Pod deployment

[root@master ~]# kubectl  get pod -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP               NODE     NOMINATED NODE   READINESS GATES
metallb-controller-75bf779d4f-585mp   1/1     Running   0          14s   10.244.104.8     node2    <none>           <none>
metallb-speaker-4cnnj                 1/1     Running   0          14s   192.168.26.12    node2    <none>           <none>
metallb-speaker-kkd5n                 1/1     Running   0          14s   192.168.26.11    node1    <none>           <none>
metallb-speaker-w8bs4                 1/1     Running   0          14s   192.168.26.10    master   <none>           <none>
my-nginx-f97c96f6d-dfnj9              1/1     Running   0          27s   10.244.166.131   node1    <none>           <none>

Testing the LB service

my-nginx is a LoadBalancer-type service; the external IP it was allocated, 192.168.26.190, is on the host network segment (192.168.26.0/24).

[root@master ~]# kubectl  get svc
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)        AGE
kubernetes   ClusterIP      10.96.0.1      <none>           443/TCP        19h
my-nginx     LoadBalancer   10.101.85.30   192.168.26.190   80:32366/TCP   17s
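
This allocation comes from MetalLB's address-pool configuration. A minimal sketch of a ConfigMap that would produce it is below; the controller log in section 3 shows it loading config from default/metallb, and the speaker reports pool "default" with protocol layer2, but the exact address range here is an assumption:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: default   # controller logs show it watching configmap default/metallb
  name: metallb
data:
  config: |
    address-pools:
    - name: default                     # matches "pool":"default" in the speaker log
      protocol: layer2
      addresses:
      - 192.168.26.190-192.168.26.250   # assumed range; .190 is what was allocated
```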

2 Node Information

kube-proxy runs in IPVS mode here: note below that the LoadBalancer IP 192.168.26.190 is bound to the kube-ipvs0 dummy interface on every node. MetalLB does not put the IP on a real NIC; only one node's speaker answers ARP for it.

Master node

[root@master ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:0e:4e:dd brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
       valid_lft 85378sec preferred_lft 85378sec
    inet6 fe80::8fb:7623:d2f6:25e4/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 10:00:00:00:00:a0 brd ff:ff:ff:ff:ff:ff
    inet 192.168.26.10/24 brd 192.168.26.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::1200:ff:fe00:a0/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:df:3f:fc:54 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
5: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 42:d5:d5:cc:8d:d7 brd ff:ff:ff:ff:ff:ff
6: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
    link/ether 72:f1:c0:46:00:c2 brd ff:ff:ff:ff:ff:ff
    inet 10.96.0.10/32 brd 10.96.0.10 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 10.96.0.1/32 brd 10.96.0.1 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 10.101.139.35/32 brd 10.101.139.35 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 192.168.26.190/32 brd 192.168.26.190 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
7: calicebcde35cc6@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
       valid_lft forever preferred_lft forever

Node1

[root@node1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:0e:4e:dd brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
       valid_lft 85306sec preferred_lft 85306sec
    inet6 fe80::7b7f:9e4b:166d:56cf/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 10:00:00:00:00:b1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.26.11/24 brd 192.168.26.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::1200:ff:fe00:b1/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:a9:ab:b7:d8 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
5: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 52:d4:68:89:70:4f brd ff:ff:ff:ff:ff:ff
6: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
    link/ether 0a:ed:a7:28:3c:5a brd ff:ff:ff:ff:ff:ff
    inet 10.96.0.10/32 brd 10.96.0.10 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 10.96.0.1/32 brd 10.96.0.1 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 10.101.139.35/32 brd 10.101.139.35 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 192.168.26.190/32 brd 192.168.26.190 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
7: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 10.244.166.128/32 brd 10.244.166.128 scope global tunl0
       valid_lft forever preferred_lft forever

Node2

[root@node2 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:0e:4e:dd brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
       valid_lft 85054sec preferred_lft 85054sec
    inet6 fe80::3b32:152f:273d:43c8/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1300 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 10:00:00:00:00:b2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.26.12/24 brd 192.168.26.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::1200:ff:fe00:b2/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:09:e0:d2:f0 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
5: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 8a:6f:d9:a9:99:93 brd ff:ff:ff:ff:ff:ff
6: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
    link/ether 12:42:77:11:42:72 brd ff:ff:ff:ff:ff:ff
    inet 10.101.139.35/32 brd 10.101.139.35 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 192.168.26.190/32 brd 192.168.26.190 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 10.96.0.10/32 brd 10.96.0.10 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
    inet 10.96.0.1/32 brd 10.96.0.1 scope global kube-ipvs0
       valid_lft forever preferred_lft forever
7: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 10.244.104.0/32 brd 10.244.104.0 scope global tunl0
       valid_lft forever preferred_lft forever
8: cali5f2d86330cb@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
       valid_lft forever preferred_lft forever
9: cali26eb7e820f9@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::ecee:eeff:feee:eeee/64 scope link
       valid_lft forever preferred_lft forever

Test node (a client host outside the cluster, on the same 192.168.26.0/24 segment)

[root@out ~]# arp -an
? (192.168.26.11) at 10:00:00:00:00:b1 [ether] on eth1
? (10.0.2.3) at 52:54:00:12:35:03 [ether] on eth0
? (192.168.26.12) at 10:00:00:00:00:b2 [ether] on eth1
? (192.168.26.10) at 10:00:00:00:00:a0 [ether] on eth1
? (192.168.26.190) at 10:00:00:00:00:b1 [ether] on eth1
? (10.0.2.2) at 52:54:00:12:35:02 [ether] on eth0
[root@out ~]# curl 192.168.26.190
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
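
The role the ARP table plays in locating the announcer (used again in section 3) can be sketched in code. A small Python helper — lb_announcer_mac is a hypothetical name, not a MetalLB API — parses arp -an output and returns the MAC that answered for a given IP; fed the test node's table, it shows the LB IP resolving to node1's eth1 MAC:

```python
import re

def lb_announcer_mac(arp_output: str, ip: str) -> str:
    """Return the MAC address that `arp -an` recorded for `ip`."""
    for line in arp_output.splitlines():
        m = re.match(r"\? \(([\d.]+)\) at ([0-9a-f:]+)", line)
        if m and m.group(1) == ip:
            return m.group(2)
    raise KeyError(f"no ARP entry for {ip}")

# Entries copied from the test node's `arp -an` output above.
arp = """\
? (192.168.26.11) at 10:00:00:00:00:b1 [ether] on eth1
? (192.168.26.12) at 10:00:00:00:00:b2 [ether] on eth1
? (192.168.26.10) at 10:00:00:00:00:a0 [ether] on eth1
? (192.168.26.190) at 10:00:00:00:00:b1 [ether] on eth1
"""

# The LB IP shares node1's MAC, so node1's speaker is announcing it.
lb_mac = lb_announcer_mac(arp, "192.168.26.190")
node1_mac = lb_announcer_mac(arp, "192.168.26.11")
```

Here lb_mac and node1_mac are both 10:00:00:00:00:b1, confirming node1 as the announcing node.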

3 How It Works

MetalLB has two components: the controller, which allocates IPs to LoadBalancer services, and the speaker, which announces them (here over Layer 2 / ARP).

Controller logs

The key event in the controller log is the allocation of the service IP:

{"caller":"service.go:98","event":"ipAllocated","ip":"192.168.26.190","msg":"IP address assigned by controller","service":"default/my-nginx","ts":"2020-05-22T02:17:11.742233189Z"}

[root@master ~]# kubectl  logs metallb-controller-75bf779d4f-585mp
{"branch":"HEAD","caller":"main.go:142","commit":"v0.8.1","msg":"MetalLB controller starting version 0.8.1 (commit v0.8.1, branch HEAD)","ts":"2020-05-22T02:17:11.577936238Z","version":"0.8.1"}
{"caller":"main.go:108","configmap":"default/metallb","event":"startUpdate","msg":"start of config update","ts":"2020-05-22T02:17:11.686448912Z"}
{"caller":"main.go:121","configmap":"default/metallb","event":"endUpdate","msg":"end of config update","ts":"2020-05-22T02:17:11.686475979Z"}
{"caller":"k8s.go:376","configmap":"default/metallb","event":"configLoaded","msg":"config (re)loaded","ts":"2020-05-22T02:17:11.68648444Z"}
{"caller":"main.go:49","event":"startUpdate","msg":"start of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:11.686507792Z"}
{"caller":"service.go:33","event":"clearAssignment","msg":"not a LoadBalancer","reason":"notLoadBalancer","service":"default/kubernetes","ts":"2020-05-22T02:17:11.686521668Z"}
{"caller":"main.go:75","event":"noChange","msg":"service converged, no change","service":"default/kubernetes","ts":"2020-05-22T02:17:11.686549849Z"}
{"caller":"main.go:76","event":"endUpdate","msg":"end of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:11.686559142Z"}
{"caller":"main.go:49","event":"startUpdate","msg":"start of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:11.686570384Z"}
{"caller":"service.go:85","error":"controller not synced","msg":"controller not synced yet, cannot allocate IP; will retry after sync","op":"allocateIP","service":"default/my-nginx","ts":"2020-05-22T02:17:11.686579009Z"}
{"caller":"main.go:72","event":"endUpdate","msg":"end of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:11.686587521Z"}
{"caller":"main.go:49","event":"startUpdate","msg":"start of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.686598889Z"}
{"caller":"service.go:33","event":"clearAssignment","msg":"not a LoadBalancer","reason":"notLoadBalancer","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.686606378Z"}
{"caller":"main.go:75","event":"noChange","msg":"service converged, no change","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.68662786Z"}
{"caller":"main.go:76","event":"endUpdate","msg":"end of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.686634351Z"}
{"caller":"main.go:126","event":"stateSynced","msg":"controller synced, can allocate IPs now","ts":"2020-05-22T02:17:11.686645509Z"}
{"caller":"main.go:49","event":"startUpdate","msg":"start of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.698513135Z"}
{"caller":"service.go:33","event":"clearAssignment","msg":"not a LoadBalancer","reason":"notLoadBalancer","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.698558483Z"}
{"caller":"main.go:75","event":"noChange","msg":"service converged, no change","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.698596972Z"}
{"caller":"main.go:76","event":"endUpdate","msg":"end of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:11.698605272Z"}
{"caller":"main.go:49","event":"startUpdate","msg":"start of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:11.698617575Z"}
{"caller":"service.go:33","event":"clearAssignment","msg":"not a LoadBalancer","reason":"notLoadBalancer","service":"default/kubernetes","ts":"2020-05-22T02:17:11.703655381Z"}
{"caller":"main.go:75","event":"noChange","msg":"service converged, no change","service":"default/kubernetes","ts":"2020-05-22T02:17:11.703710198Z"}
{"caller":"main.go:76","event":"endUpdate","msg":"end of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:11.703726179Z"}
{"caller":"main.go:49","event":"startUpdate","msg":"start of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:11.703745316Z"}
{"caller":"service.go:98","event":"ipAllocated","ip":"192.168.26.190","msg":"IP address assigned by controller","service":"default/my-nginx","ts":"2020-05-22T02:17:11.742233189Z"}

Speaker logs

From the ARP replies on the test node, the MAC address that answers for 192.168.26.190 tells us which node's speaker is announcing the IP.

The nginx pod runs on node1 (192.168.26.11), and ARP resolves .190 to node1's eth1 MAC (10:00:00:00:00:b1), so requests for the LB IP are delivered to node1. The key line in the speaker log is the announcement:

{"caller":"main.go:340","event":"serviceAnnounced","ip":"192.168.26.190","msg":"service has IP, announcing","pool":"default","protocol":"layer2","service":"default/my-nginx","ts":"2020-05-22T02:17:11.74920593Z"}

[root@master ~]# kubectl  logs metallb-speaker-kkd5n
{"branch":"main","caller":"main.go:84","commit":"734ee674","msg":"MetalLB speaker starting (commit 734ee674, branch main)","ts":"2020-05-22T02:17:10.094496521Z","version":""}
{"caller":"main.go:105","msg":"Not starting fast dead node detection (MemberList), need ml-bindaddr / ml-labels / ml-namespace config","op":"startup","ts":"2020-05-22T02:17:10.094565059Z"}
{"caller":"announcer.go:103","event":"createARPResponder","interface":"eth0","msg":"created ARP responder for interface","ts":"2020-05-22T02:17:10.096927728Z"}
{"caller":"announcer.go:112","event":"createNDPResponder","interface":"eth0","msg":"created NDP responder for interface","ts":"2020-05-22T02:17:10.097172952Z"}
{"caller":"announcer.go:103","event":"createARPResponder","interface":"eth1","msg":"created ARP responder for interface","ts":"2020-05-22T02:17:10.097312492Z"}
{"caller":"announcer.go:112","event":"createNDPResponder","interface":"eth1","msg":"created NDP responder for interface","ts":"2020-05-22T02:17:10.097523494Z"}
{"caller":"announcer.go:103","event":"createARPResponder","interface":"docker0","msg":"created ARP responder for interface","ts":"2020-05-22T02:17:10.097732881Z"}
{"caller":"announcer.go:103","event":"createARPResponder","interface":"cali37e7c6d2053","msg":"created ARP responder for interface","ts":"2020-05-22T02:17:10.098019843Z"}
{"caller":"announcer.go:112","event":"createNDPResponder","interface":"cali37e7c6d2053","msg":"created NDP responder for interface","ts":"2020-05-22T02:17:10.098082182Z"}
{"caller":"main.go:383","configmap":"default/metallb","event":"startUpdate","msg":"start of config update","ts":"2020-05-22T02:17:10.234129838Z"}
{"caller":"main.go:407","configmap":"default/metallb","event":"endUpdate","msg":"end of config update","ts":"2020-05-22T02:17:10.234162307Z"}
{"caller":"k8s.go:402","configmap":"default/metallb","event":"configLoaded","msg":"config (re)loaded","ts":"2020-05-22T02:17:10.234171521Z"}
{"caller":"bgp_controller.go:285","event":"nodeLabelsChanged","msg":"Node labels changed, resyncing BGP peers","ts":"2020-05-22T02:17:10.234193311Z"}
{"caller":"main.go:264","event":"startUpdate","msg":"start of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:10.234204251Z"}
{"caller":"main.go:268","event":"endUpdate","msg":"end of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:10.234212547Z"}
{"caller":"main.go:264","event":"startUpdate","msg":"start of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:10.234221455Z"}
{"caller":"main.go:277","event":"endUpdate","msg":"end of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:10.234227764Z"}
{"caller":"main.go:264","event":"startUpdate","msg":"start of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:10.234235443Z"}
{"caller":"main.go:268","event":"endUpdate","msg":"end of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:10.234243163Z"}
{"caller":"main.go:264","event":"startUpdate","msg":"start of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:10.23948262Z"}
{"caller":"main.go:268","event":"endUpdate","msg":"end of service update","service":"default/kubernetes","ts":"2020-05-22T02:17:10.239523709Z"}
{"caller":"main.go:264","event":"startUpdate","msg":"start of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:10.239534319Z"}
{"caller":"main.go:277","event":"endUpdate","msg":"end of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:10.239540994Z"}
{"caller":"main.go:264","event":"startUpdate","msg":"start of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:10.239550003Z"}
{"caller":"main.go:268","event":"endUpdate","msg":"end of service update","service":"kube-system/kube-dns","ts":"2020-05-22T02:17:10.239556402Z"}
{"caller":"main.go:264","event":"startUpdate","msg":"start of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:11.749145884Z"}
{"caller":"main.go:340","event":"serviceAnnounced","ip":"192.168.26.190","msg":"service has IP, announcing","pool":"default","protocol":"layer2","service":"default/my-nginx","ts":"2020-05-22T02:17:11.74920593Z"}
{"caller":"main.go:343","event":"endUpdate","msg":"end of service update","service":"default/my-nginx","ts":"2020-05-22T02:17:11.749256307Z"}

4 Failure Handling

Speaker pod failure

A speaker pod failure can be simulated by adding a nodeSelector that evicts the pod from its node.

Node outage

When the announcing node goes down, another speaker takes over the announcement, and the ARP responses change accordingly.

The relevant code path: speaker/main.go:196, watchMemberListEvents.

Node eviction

MetalLB's announcer election watches for node changes (watchMemberListEvents). When the announcing node is removed with kubectl delete node, the other speaker pods observe the change through their apiserver watch and resync, so the LB IP is re-announced from a new node.
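
The failover can be illustrated with a deterministic election sketch. This is an illustration of the idea rather than MetalLB's exact code: assume each speaker hashes the service name against each candidate node and the smallest digest wins, so every speaker independently agrees on the announcer, and removing the winner moves the IP:

```python
import hashlib

def choose_announcer(service: str, speakers: list[str]) -> str:
    """Deterministically pick one speaker to announce a service IP.

    Every speaker can run this locally and reach the same answer:
    hash (service, node) pairs and take the smallest digest.
    Sketch of the idea only, not MetalLB's actual election code.
    """
    return min(speakers,
               key=lambda node: hashlib.sha256(f"{service}#{node}".encode()).digest())

speakers = ["master", "node1", "node2"]
winner = choose_announcer("default/my-nginx", speakers)

# After `kubectl delete node <winner>`, the surviving speakers re-run
# the election and the LB IP is announced from a different node:
survivors = [n for n in speakers if n != winner]
new_winner = choose_announcer("default/my-nginx", survivors)
```

Because the hash input includes the service name, different services can land on different announcers, spreading load while each individual IP still has exactly one owner.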

5 Code Analysis

https://github.com/huiwq1990/metallb/commits/hg

6 References

https://www.objectif-libre.com/en/blog/2019/06/11/metallb/

https://metallb.universe.tf/
