Cloud Native

15. Kubernetes Container Log Collection

2022-05-19  負笈在线

(1) Kubernetes Log Collection

1. What logs does Kubernetes need to collect?

Broadly, this article covers two kinds: console logs that containers write to stdout/stderr (collected with EFK in part (2)) and log files that applications write to paths inside the container (collected with a Filebeat sidecar in part (3)). Kubernetes component logs (kubelet, kube-apiserver, and so on) may also need to be collected.

2. Common log collection stacks
An ELK pipeline can be assembled in several ways (the components can be combined freely according to your needs). Common options include:
1. Logstash (collect, process) -> Elasticsearch (store) -> Kibana (display)
2. Logstash (collect) -> Logstash (aggregate, process) -> Elasticsearch (store) -> Kibana (display)
3. Filebeat (collect, process) -> Elasticsearch (store) -> Kibana (display)
4. Filebeat (collect) -> Logstash (aggregate, process) -> Elasticsearch (store) -> Kibana (display)

The options above are basic log collection pipelines. For large log volumes, a message queue can be added to absorb peaks and avoid blocking:
Filebeat (collect) -> Kafka/Redis (peak buffering) -> Logstash (aggregate, process) -> Elasticsearch (store) -> Kibana (display)

3. Elasticsearch + Fluentd + Kibana architecture

In this architecture, Fluentd runs as a DaemonSet on the nodes to be collected, reads the container console logs, and ships them to Elasticsearch; Kibana queries Elasticsearch for search and visualization. This is the setup deployed below.

(2) Collecting Console Logs with EFK

1. Deploying Elasticsearch + Fluentd + Kibana

The lab environment is as follows; each server should have at least 2 CPU cores and 4 GB of memory:

# kubectl get node
NAME           STATUS   ROLES    AGE   VERSION
k8s-master01   Ready    <none>   88d   v1.23.0
k8s-master02   Ready    <none>   88d   v1.23.0
k8s-master03   Ready    <none>   88d   v1.23.0
k8s-node01     Ready    <none>   88d   v1.23.0
k8s-node02     Ready    <none>   88d   v1.23.0

Download the required deployment files:

# git clone https://github.com/dotbalo/k8s.git
# cd k8s/efk-7.10.2/

Create the namespace used by EFK:

[root@k8s-master01 efk-7.10.2]# kubectl create -f create-logging-namespace.yaml
namespace/logging created

Create the Elasticsearch cluster (this can be skipped if your organization already has an ELK platform):

[root@k8s-master01 efk-7.10.2]# kubectl create -f es-service.yaml
service/elasticsearch-logging created
[root@k8s-master01 efk-7.10.2]# kubectl create -f es-statefulset.yaml
serviceaccount/elasticsearch-logging created
clusterrole.rbac.authorization.k8s.io/elasticsearch-logging created
clusterrolebinding.rbac.authorization.k8s.io/elasticsearch-logging created
statefulset.apps/elasticsearch-logging created

Create Kibana (this can also be skipped if your organization already has an ELK platform):

[root@k8s-master01 efk-7.10.2]# kubectl create -f kibana-deployment.yaml -f kibana-service.yaml
deployment.apps/kibana-logging created
service/kibana-logging created

Since we may not need to collect logs from every node in the Kubernetes cluster, we can modify the Fluentd deployment file as follows, adding a nodeSelector so that Fluentd is only deployed to the nodes whose logs should be collected:

# grep "nodeSelector" fluentd-es-ds.yaml -A 3
 nodeSelector:
 fluentd: "true"
 ...
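
For context, the selector sits under the DaemonSet's Pod template spec. A trimmed sketch showing only the relevant fields (everything else is omitted):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-es-v3.1.1
  namespace: logging
spec:
  template:
    spec:
      nodeSelector:
        fluentd: "true"
      ...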

Then label the nodes that should have their logs collected; using k8s-node01 as an example:

# kubectl label node k8s-node01 fluentd=true
# kubectl get node -l fluentd=true --show-labels
NAME         STATUS   ROLES    AGE   VERSION   LABELS
k8s-node01   Ready    <none>   88d   v1.23.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,fluentd=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node01,kubernetes.io/os=linux,node.kubernetes.io/node=

Create Fluentd:

[root@k8s-master01 efk-7.10.2]# kubectl create -f fluentd-es-ds.yaml -f fluentd-es-configmap.yaml
serviceaccount/fluentd-es created
clusterrole.rbac.authorization.k8s.io/fluentd-es created
clusterrolebinding.rbac.authorization.k8s.io/fluentd-es created
daemonset.apps/fluentd-es-v3.1.1 created
configmap/fluentd-es-config-v0.2.1 created

One field in Fluentd's ConfigMap deserves attention: at the end of fluentd-es-configmap.yaml there is an output.conf section that points at the Elasticsearch Service:

output.conf: |-
  <match **>
    ...
    host elasticsearch-logging
    port 9200
    ...
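
If your organization already runs its own Elasticsearch/ELK platform, only host and port need to point at it. A trimmed sketch of such an output block (es.example.com is a placeholder address; assuming the repository's defaults, logstash_format true is the option that produces the logstash-* index names used for the Kibana index pattern later):

output.conf: |-
  <match **>
    @id elasticsearch
    @type elasticsearch
    # point host/port at your existing Elasticsearch platform
    host es.example.com
    port 9200
    # writes indices named logstash-YYYY.MM.DD
    logstash_format true
    ...
  </match>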

2. Using Kibana

Confirm that all the Pods created above have started successfully:

# kubectl get po -n logging
NAME                              READY   STATUS    RESTARTS   AGE
elasticsearch-logging-0           1/1     Running   0          15h
elasticsearch-logging-1           1/1     Running   0          15h
fluentd-es-v3.1.1-p4zbk           1/1     Running   17         15h
kibana-logging-75bd6cchf5-psda5   1/1     Running   7          15h

Next, check the port exposed by the Kibana Service in order to access Kibana:

# kubectl get svc -n logging
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
elasticsearch-logging   ClusterIP   None             <none>        9200/TCP,9300/TCP   15h
kibana-logging          NodePort    192.168.148.65   <none>        5601:32664/TCP      15h

Kibana can now be reached at port 32664 on the IP of any node running kube-proxy:



Click Explore on my own, then click Visualize:



Then click Add your data -> Create index pattern:


In Index pattern name, enter the index name logstash*, then click Next step:



Next, select the timestamp field and the index pattern can be created:

Then click Discover in the menu bar to see the logs:

(3) Collecting Custom File Logs with Filebeat

1. Creating Kafka and Logstash

First, deploy Kafka and Logstash to the Kubernetes cluster. If your organization already has a mature stack, there is no need to deploy them; simply point Filebeat's output at the external Kafka cluster:

# cd filebeat
# helm install zookeeper zookeeper/ -n logging
# kubectl get po -n logging -l app.kubernetes.io/name=zookeeper
NAME          READY   STATUS    RESTARTS   AGE
zookeeper-0   1/1     Running   0          51s
# helm install kafka kafka/ -n logging
# kubectl get po -n logging -l app.kubernetes.io/component=kafka
NAME      READY   STATUS    RESTARTS   AGE
kafka-0   1/1     Running   0          43s

Once the Pods are running normally, create the Logstash service:

# kubectl create -f logstash-service.yaml -f logstash-cm.yaml -f logstash.yaml -n logging
service/logstash-service created
configmap/logstash-configmap created
deployment.apps/logstash-deployment created

Note the following settings in the logstash-cm.yaml file (a sketch of such a pipeline follows the list):

- input: the data source; in this example it is Kafka.
- input.kafka.bootstrap_servers: the Kafka address. Since Kafka is installed inside the cluster, the Kafka Service address can be used directly; for an external cluster, configure the address as needed.
- input.kafka.topics: the Kafka topic; it must match the topic Filebeat writes to.
- input.kafka.type: defines a type, which can be used to route Logstash output to different Elasticsearch clusters.
- output: where the data is sent. In this example it is the Elasticsearch cluster, with a conditional: when type is filebeat-sidecar, the data is written to Elasticsearch with an index of filebeat-xxx.
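
The repository's logstash-cm.yaml may differ in detail, but a minimal pipeline matching the points above might look like this (kafka:9092 and elasticsearch-logging:9200 are the in-cluster Service addresses used elsewhere in this article; filebeat-%{+YYYY.MM.dd} is one way to realize the filebeat-xxx index naming mentioned above):

input {
  kafka {
    bootstrap_servers => "kafka:9092"
    topics => ["filebeat-sidecar"]
    codec => "json"
    type => "filebeat-sidecar"
  }
}

output {
  if [type] == "filebeat-sidecar" {
    elasticsearch {
      hosts => ["http://elasticsearch-logging:9200"]
      index => "filebeat-%{+YYYY.MM.dd}"
    }
  }
}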

2. Injecting the Filebeat Sidecar

Next, create a test application:

# kubectl create -f app.yaml -nlogging

This application continuously appends the current date to the file /opt/date.log; its configuration is as follows:

      command:
        - sh
        - -c
        - while true; do date >> /opt/date.log; sleep 2; done

After it starts successfully, you can check the file's contents:

# kubectl get po -nlogging
NAME                   READY   STATUS    RESTARTS   AGE
app-6dd64bdc55-8wzjc   1/1     Running   0          19m
# kubectl logs app-6dd64bdc55-8wzjc -nlogging
# kubectl exec app-6dd64bdc55-8wzjc -nlogging -- tail -1 /opt/date.log
Wed Jul 10 09:10:06 UTC 2021

At this point, searching for this log in Kibana returns nothing, because Fluentd cannot collect log files written inside a container. Next, modify the application and add a Filebeat sidecar to the Deployment (only part of the file is shown):

# cat app-filebeat.yaml 
...
      containers:
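        # Filebeat sidecar: reads the shared log directory and ships entries to Kafka.
        # The pod* env vars below come from the downward API and are referenced as
        # ${...} variables in filebeat.yml (see filebeat-cm.yaml).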
        - name: filebeat                        
          image: registry.cn-beijing.aliyuncs.com/dotbalo/filebeat:7.10.2 
          imagePullPolicy: IfNotPresent
          env:
            - name: podIp
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.podIP
            - name: podName
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: podNamespace
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: podDeployName
              value: app
            - name: TZ
              value: "Asia/Shanghai"
          securityContext:
            runAsUser: 0
          volumeMounts:
            - name: logpath
              mountPath: /data/log/app/
            - name: filebeatconf
              mountPath: /usr/share/filebeat/filebeat.yml 
              subPath: usr/share/filebeat/filebeat.yml
        - name: app
          image: registry.cn-beijing.aliyuncs.com/dotbalo/alpine:3.6 
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: logpath
              mountPath: /opt/
          env:
            - name: TZ
              value: "Asia/Shanghai"
            - name: LANG
              value: C.UTF-8
            - name: LC_ALL
              value: C.UTF-8
          command:
            - sh
            - -c
            - while true; do date >> /opt/date.log; sleep 2;  done 
      volumes:
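        # logpath: emptyDir shared by both containers (mounted at /opt/ in the app
        # container and /data/log/app/ in Filebeat).
        # filebeatconf: the ConfigMap defined in filebeat-cm.yaml below.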
        - name: logpath
          emptyDir: {}
        - name: filebeatconf
          configMap:
            name: filebeatconf
            items:
              - key: filebeat.yml
                path: usr/share/filebeat/filebeat.yml

As you can see, a volumes section was added to the Deployment file, defining a volume named logpath that is mounted at /opt/ in the application container and at /data/log/app/ in the Filebeat container, so the two containers in the same Pod share the same directory.
Next, create a Filebeat configuration file that collects the logs under that directory:

# cat filebeat-cm.yaml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeatconf
data:
  filebeat.yml: |-
    filebeat.inputs:
    - input_type: log
      paths:
        - /data/log/*/*.log
      tail_files: true
      fields:
        pod_name: '${podName}'
        pod_ip: '${podIp}'
        pod_deploy_name: '${podDeployName}'
        pod_namespace: '${podNamespace}'
    output.kafka:
      hosts: ["kafka:9092"]
      topic: "filebeat-sidecar"
      codec.json:
        pretty: false
      keep_alive: 30s

Note that paths points at the shared directory, output.kafka must point at the same Kafka cluster that Logstash reads from, and the topic must be the one Logstash consumes. Then inject Filebeat:

# kubectl apply -f filebeat-cm.yaml -f app-filebeat.yaml -n logging
configmap/filebeatconf created
deployment.apps/app configured
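
Optionally, before moving on to Kibana, you can confirm that Filebeat's messages are arriving on the topic. Assuming the Kafka chart is based on the Bitnami image (an assumption; adjust the script path for other images), run the console consumer inside kafka-0:

# kubectl exec -it kafka-0 -n logging -- kafka-console-consumer.sh \
    --bootstrap-server kafka:9092 --topic filebeat-sidecar --from-beginning --max-messages 5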

Afterwards, add a Filebeat index pattern in Kibana to view the logs. The steps are the same as for EFK; only the index name changes. First click Stack Management in the menu bar:



Then click to add an index pattern:



In Index pattern name, enter filebeat*, then click Next step:

Once added, select the filebeat index pattern in Kibana to view the logs; this is not demonstrated again here.

3. Cleaning Up

In a learning environment, the data can be cleaned up after finishing the exercises:

# helm delete kafka zookeeper -n logging
# kubectl delete -f . -n logging
# cd ..
# ls
create-logging-namespace.yaml es-service.yaml es-statefulset.yaml filebeat fluentd-es-configmap.yaml fluentd-es-ds.yaml kafka kibana-deployment.yaml kibana-service.yaml
# kubectl delete -f . -n logging

(4) First Look at Loki

1. Installing the Loki Stack

Loki can be installed with Helm. First add and update Loki's Helm repository:

# helm repo add grafana https://grafana.github.io/helm-charts
"grafana" has been added to your repositories
# helm repo update
Hang tight while we grab the latest from your chart repositories...
...
...Successfully got an update from the "grafana" chart repository

Create the Loki namespace:

# kubectl create ns loki
namespace/loki created

Create the Loki stack:

# helm upgrade --install loki grafana/loki-stack --set grafana.enabled=true --set grafana.service.type=NodePort -n loki
NAME: loki
NAMESPACE: loki
STATUS: deployed
REVISION: 1
NOTES:
The Loki stack has been deployed to your cluster. Loki can now be added as a datasource in Grafana.
See http://docs.grafana.org/features/datasources/loki/ for more detail.

Check the Pod status:

# kubectl get po -n loki
NAME                            READY   STATUS    RESTARTS   AGE
loki-0                          1/1     Running   0          4m23s
loki-grafana-5b57955f9d-48z6l   1/1     Running   0          4m23s
loki-promtail-2n24m             1/1     Running   0          4m23s
loki-promtail-59slx             1/1     Running   0          4m23s
loki-promtail-6flzq             1/1     Running   0          4m23s
loki-promtail-gq2hk             1/1     Running   0          4m23s
loki-promtail-sqwtv             1/1     Running   0          4m23s

Check the port exposed by the Grafana Service:

# kubectl get svc -n loki
NAME            TYPE        CLUSTER-IP        EXTERNAL-IP   PORT(S)        AGE
loki            ClusterIP   192.168.67.18     <none>        3100/TCP       7m54s
loki-grafana    NodePort    192.168.103.121   <none>        80:31053/TCP   7m54s
loki-headless   ClusterIP   None              <none>        3100/TCP       7m54s

Grafana can then be accessed at port 31053 on the IP of any node running kube-proxy:


Get the Grafana password (the username is admin):
# kubectl get secret --namespace loki loki-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
eD47DAbzhyPLgDeSM8C0LvBi3DksU73vZND8t4h0

After logging in, click through as shown in the figure to open the Loki syntax guide:


Other installation options are documented at: https://grafana.com/docs/loki/latest/installation/helm/

2. Getting Started with Loki Syntax

Loki's design is modeled on Prometheus, so its query syntax resembles PromQL. For example, to query the logs of all Pods in the kube-system namespace, enter {namespace="kube-system"} in the Log browser and click Run query:



At the bottom of the figure are the log details, shown below:



As you can see, many labels are attached to each log line, and logs can then be filtered by these labels. For example, to view logs from the kube-system namespace whose Pod name contains calico (screenshots are not provided for the following queries; readers can try them themselves):

{namespace="kube-system", pod=~"calico.*"}

The =~ here performs a regular-expression match; the related operators are (an example follows the list):
- =~ : regex match
- = : exact match
- != : not equal
- !~ : regex non-match
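
For example, using the same label set but excluding the calico Pods:

{namespace="kube-system", pod!~"calico.*"}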

Loki also supports pipelines. For example, to keep only the log lines that contain the string avg:

{namespace="kube-system", pod=~"calico.*"} |~ "avg"

The matching logs are as follows:



Log lines can also be parsed with logfmt so that field values can be compared, for example to find lines that contain avg and whose longest value is greater than 16ms:

{namespace="kube-system", pod=~"calico.*"} |~ "avg" | logfmt | longest > 16ms

The resulting logs are shown below; compared with the previous result, only lines whose longest value exceeds 16ms remain, and the rest have been filtered out:



Value filters can combine multiple conditions, for example logs whose longest value is greater than 16ms and whose avg value is greater than 6ms (see the query sketch below):
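
Following the pattern of the previous queries, such a query can be written as (LogQL allows combining label-filter predicates with and):

{namespace="kube-system", pod=~"calico.*"} |~ "avg" | logfmt | longest > 16ms and avg > 6ms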

The syntax demonstrated above is the most basic and most commonly used. For more complex syntax, see the official LogQL documentation: https://grafana.com/docs/loki/latest/logql/
