Setting up a Kubernetes cluster with metrics-server and Prometheus

2020-08-27  Li_MAX

Preparation

Prepare three Linux servers:

Role    Hostname   IP address
master  k8s        192.168.5.159
node1   k8s-node   192.168.5.160
node2   k8s-node2  192.168.5.161

Add the host entries on every server:

vim /etc/hosts

192.168.5.159 master
192.168.5.160 node1
192.168.5.161 node2

Disable the firewall

systemctl stop firewalld
systemctl disable firewalld

Synchronize the system time on all three servers

# install ntp
yum install -y ntp
# sync the time
ntpdate cn.pool.ntp.org

Disable SELinux

sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0

Disable swap (Kubernetes does not support swap partitions).
Edit /etc/fstab and comment out or delete the swap line:

vim /etc/fstab
#/dev/mapper/centos-swap swap                    swap    defaults        0 0

# turn swap off immediately, without waiting for a reboot
swapoff -a

Pass bridged IPv4 traffic to the iptables chains:

cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sysctl --system

Install Docker, kubeadm, and kubelet

Perform all of the following steps on every node.

# wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
# yum -y install docker-ce-18.06.1.ce-3.el7
# systemctl enable docker && systemctl start docker
# docker --version
Docker version 18.06.1-ce, build e68fc7a

Add the Aliyun yum repository

# cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

Install kubeadm, kubelet, and kubectl

yum install -y kubelet-1.17.3 kubeadm-1.17.3 kubectl-1.17.3
systemctl enable kubelet

Deploy the master node

The default image registry k8s.gcr.io is unreachable from mainland China, so we point kubeadm at the Aliyun mirror (registry.aliyuncs.com/google_containers). The official recommendation is at least 2 CPUs and 2 GB of RAM per server.

kubeadm init \
--apiserver-advertise-address=192.168.5.159 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.17.3 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16

Set up the kubeconfig

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# kubectl get nodes

Deploy the Pod network add-on

Required on all nodes. Since quay.io is blocked, pull a mirrored image and re-tag it as a workaround (the tag you apply must match the image referenced in kube-flannel.yml):

docker pull lizhenliang/flannel:v0.11.0-amd64
docker tag lizhenliang/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.12.0-amd64

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Check the status.
All pods should be Running; any other state (e.g. Pending or ImagePullBackOff) means the Pod is not ready.

kubectl get pod --all-namespaces

If a Pod is not Running, inspect it for the underlying error. For example, to examine the kube-flannel-ds-amd64-xpd82 Pod:

kubectl describe pod kube-flannel-ds-amd64-xpd82 -n kube-system

Join the worker nodes to the master

kubeadm join 192.168.5.159:6443 --token 1l64hh.7z7xgdjp4bu58720     --discovery-token-ca-cert-hash sha256:8c4bafd2aa326a7c45754f982132a38a8b4f651ca6d052dc4294424e93fe7129

If the master's token is lost, list the current tokens with:

kubeadm token list

Tokens are valid for 24 hours by default; expired tokens no longer appear in the list, so create a new one:

kubeadm token create
# or generate the complete join command (token plus CA hash) in one step:
kubeadm token create --print-join-command

Get the sha256 hash of the CA certificate:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
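The pipeline above extracts the public key from the CA certificate, DER-encodes it, and takes its SHA-256 digest. The same three-stage pipe can be demonstrated on any throwaway RSA key (a self-contained sketch, not tied to this cluster):

```shell
# Generate a throwaway RSA key and hash its DER-encoded public key --
# the same transformation kubeadm applies to the cluster CA key.
hash=$(openssl genrsa 2048 2>/dev/null \
  | openssl rsa -pubout -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //')
echo "sha256:$hash"   # 64 hex characters, the format expected by --discovery-token-ca-cert-hash
```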

Deploy metrics-server

Download the manifests:

for file in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml ; do wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/$file;done
Edit metrics-server-deployment.yaml

The default image registry is blocked, so change the image to a domestic mirror.


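The edit amounts to swapping the image field for a reachable mirror. A sketch of the change (the exact mirror path and tag below are assumptions; use whichever mirror of k8s.gcr.io you can reach):

```yaml
# metrics-server-deployment.yaml (excerpt)
containers:
  - name: metrics-server
    # original image: k8s.gcr.io/metrics-server-amd64 (blocked)
    image: registry.aliyuncs.com/google_containers/metrics-server-amd64:v0.3.6  # assumed mirror
```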
Edit resource-reader.yaml

Add nodes/stats to the ClusterRole's resources list.
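Adding nodes/stats lets metrics-server read the kubelet summary statistics. A sketch of the resulting rule (the surrounding fields follow the upstream resource-reader.yaml; only the nodes/stats entry is new):

```yaml
# resource-reader.yaml (excerpt)
rules:
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - nodes/stats   # <- the added line
      - namespaces
    verbs: ["get", "list", "watch"]
```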
kubectl apply -f .

Test


(test output omitted; it was shown only as screenshots)

Using the metrics-server API

The APIs exposed by metrics-server are listed below. Kubernetes dropped the insecure port 8080 after v1.10, so access them through a proxy or with proper authentication:

$ kubectl proxy
$ curl http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "node2",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node2",
        "creationTimestamp": "2020-08-28T02:24:07Z"
      },
      "timestamp": "2020-08-28T02:23:42Z",
      "window": "30s",
      "usage": {
        "cpu": "37549321n",
        "memory": "302864Ki"
      }
    },
    {
      "metadata": {
        "name": "master",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/master",
        "creationTimestamp": "2020-08-28T02:24:07Z"
      },
      "timestamp": "2020-08-28T02:24:30Z",
      "window": "30s",
      "usage": {
        "cpu": "174668532n",
        "memory": "1105964Ki"
      }
    },
    {
      "metadata": {
        "name": "node1",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/node1",
        "creationTimestamp": "2020-08-28T02:24:07Z"
      },
      "timestamp": "2020-08-28T02:23:43Z",
      "window": "30s",
      "usage": {
        "cpu": "22156105n",
        "memory": "362676Ki"
      }
    }
  ]
}
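In the usage blocks above, cpu is reported in nanocores (the trailing "n") and memory in kibibytes ("Ki"). Converting to the more familiar millicores is simple integer arithmetic (a self-contained sketch using node2's reading as the sample value):

```shell
# Convert nanocores to millicores: 1 millicore = 1,000,000 nanocores.
cpu_nanocores=37549321                  # node2's "cpu" value, minus the trailing "n"
echo "$(( cpu_nanocores / 1000000 ))m"  # prints 37m
```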

You can also access these APIs directly with kubectl, for example:

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes/<node-name>
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>

Setting up Prometheus

git clone https://github.com/iKubernetes/k8s-prom.git

cd k8s-prom

# create the prom namespace
[root@master k8s-prom]# kubectl apply -f namespace.yaml
namespace/prom created

# deploy node_exporter:
[root@master k8s-prom]# cd node_exporter/
[root@master node_exporter]# ls
node-exporter-ds.yaml  node-exporter-svc.yaml
[root@master node_exporter]# kubectl apply -f .
daemonset.apps/prometheus-node-exporter created
service/prometheus-node-exporter created

[root@master node_exporter]# kubectl get pods -n prom
NAME                             READY     STATUS    RESTARTS   AGE
prometheus-node-exporter-dmmjj   1/1       Running   0          7m
prometheus-node-exporter-ghz2l   1/1       Running   0          7m
prometheus-node-exporter-zt2lw   1/1       Running   0          7m

# deploy prometheus
[root@master k8s-prom]# cd prometheus/
[root@master prometheus]# ls
prometheus-cfg.yaml  prometheus-deploy.yaml  prometheus-rbac.yaml  prometheus-svc.yaml
[root@master prometheus]# kubectl apply -f .
configmap/prometheus-config created
deployment.apps/prometheus-server created
clusterrole.rbac.authorization.k8s.io/prometheus created
serviceaccount/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
service/prometheus created

Check the resources in the prom namespace. The prometheus-server Pod fails to schedule (the FailedScheduling event is visible via describe):

[root@master prometheus]# kubectl describe pod prometheus-server-556b8896d6-dfqkp -n prom
Warning  FailedScheduling  2m52s (x2 over 2m52s)  default-scheduler  0/3 nodes are available: 3 Insufficient memory.

Edit prometheus-deploy.yaml and delete the three memory-limit lines:

resources:
  limits:
    memory: 2Gi

Re-apply:

[root@master prometheus]# kubectl apply -f prometheus-deploy.yaml
[root@master prometheus]# kubectl get all -n prom
NAME                                     READY     STATUS    RESTARTS   AGE
pod/prometheus-node-exporter-dmmjj       1/1       Running   0          10m
pod/prometheus-node-exporter-ghz2l       1/1       Running   0          10m
pod/prometheus-node-exporter-zt2lw       1/1       Running   0          10m
pod/prometheus-server-65f5d59585-6l8m8   1/1       Running   0          55s
NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/prometheus                 NodePort    10.111.127.64   <none>        9090:30090/TCP   56s
service/prometheus-node-exporter   ClusterIP   None            <none>        9100/TCP         10m
NAME                                      DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-node-exporter   3         3         3         3            3           <none>          10m
NAME                                DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-server   1         1         1            1           56s
NAME                                           DESIRED   CURRENT   READY     AGE
replicaset.apps/prometheus-server-65f5d59585   1         1         1         56s

As shown above, the Service uses type NodePort, so the Prometheus application inside the cluster is reachable on port 30090 of any node.



Deploy kube-state-metrics, which aggregates cluster state data

[root@master k8s-prom]# cd kube-state-metrics/
[root@master kube-state-metrics]# ls
kube-state-metrics-deploy.yaml  kube-state-metrics-rbac.yaml  kube-state-metrics-svc.yaml
[root@master kube-state-metrics]# kubectl apply -f .
deployment.apps/kube-state-metrics created
serviceaccount/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
service/kube-state-metrics created

If the gcr.io image cannot be pulled because of the block, pull it from a reachable registry and re-tag:

docker pull quay.io/coreos/kube-state-metrics:v1.3.1
docker tag quay.io/coreos/kube-state-metrics:v1.3.1   gcr.io/google_containers/kube-state-metrics-amd64:v1.3.1

Deploy k8s-prometheus-adapter; this requires a self-signed certificate:

[root@master k8s-prometheus-adapter]# cd /etc/kubernetes/pki/
[root@master pki]# (umask 077; openssl genrsa -out serving.key 2048)
Generating RSA private key, 2048 bit long modulus
...........................................................................................+++
...............+++
e is 65537 (0x10001)

Create a certificate signing request:

[root@master pki]#  openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"

Sign the certificate:

[root@master pki]# openssl  x509 -req -in serving.csr -CA ./ca.crt -CAkey ./ca.key -CAcreateserial -out serving.crt -days 3650
Signature ok
subject=/CN=serving
Getting CA Private Key

Create the secret holding the certificate.
Note: cm-adapter-serving-certs is the name referenced in custom-metrics-apiserver-deployment.yaml.

[root@master pki]# kubectl create secret generic cm-adapter-serving-certs --from-file=serving.crt=./serving.crt --from-file=serving.key=./serving.key  -n prom
secret/cm-adapter-serving-certs created
[root@master pki]# kubectl get secrets -n prom
NAME                             TYPE                                  DATA      AGE
cm-adapter-serving-certs         Opaque                                2         51s
default-token-knsbg              kubernetes.io/service-account-token   3         1h
kube-state-metrics-token-sccdf   kubernetes.io/service-account-token   3         1h
prometheus-token-nqzbz           kubernetes.io/service-account-token   3         1h

部署k8s-prometheus-adapter:

[root@master k8s-prom]# cd k8s-prometheus-adapter/
[root@master k8s-prometheus-adapter]# ls
custom-metrics-apiserver-auth-delegator-cluster-role-binding.yaml   custom-metrics-apiserver-service.yaml
custom-metrics-apiserver-auth-reader-role-binding.yaml              custom-metrics-apiservice.yaml
custom-metrics-apiserver-deployment.yaml                            custom-metrics-cluster-role.yaml
custom-metrics-apiserver-resource-reader-cluster-role-binding.yaml  custom-metrics-resource-reader-cluster-role.yaml
custom-metrics-apiserver-service-account.yaml                       hpa-custom-metrics-cluster-role-binding.yaml

You may hit compatibility problems. To resolve them, download the latest custom-metrics-apiserver-deployment.yaml from https://github.com/DirectXMan12/k8s-prometheus-adapter/tree/master/deploy/manifests and change its namespace to prom; likewise download custom-metrics-config-map.yaml locally and change its namespace to prom.

[root@master k8s-prometheus-adapter]# kubectl apply -f .

Check the status; all pods should be Running:

[root@master k8s-prometheus-adapter]# kubectl get all -n prom

Verify that the custom.metrics.k8s.io/v1beta1 API is registered:

[root@master k8s-prometheus-adapter]# kubectl api-versions
custom.metrics.k8s.io/v1beta1

Start a proxy and test:

[root@master k8s-prometheus-adapter]# kubectl proxy --port=8080
[root@master pki]# curl  http://localhost:8080/apis/custom.metrics.k8s.io/v1beta1/
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "namespaces/kube_endpoint_info",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/kube_hpa_status_desired_replicas",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/kube_pod_container_status_waiting",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "namespaces/kube_hpa_labels",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "jobs.batch/kube_hpa_spec_min_replicas",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
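With the custom.metrics.k8s.io API registered, a HorizontalPodAutoscaler can scale on these Prometheus-derived metrics. A hedged sketch for the autoscaling/v2beta1 API used by k8s v1.17 (the deployment name myapp and the metric http_requests are purely illustrative, not objects from this cluster):

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa            # hypothetical
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp              # hypothetical deployment to scale
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metricName: http_requests    # hypothetical custom metric served by the adapter
        targetAverageValue: "100"
```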

Troubleshooting:

1. On a worker node, kubectl reports: The connection to the server localhost:8080 was refused - did you specify the right host or port?

Copy the master's /etc/kubernetes/admin.conf to $HOME/.kube/ on the worker node and rename it config.

2. If a certificate already exists, delete the corresponding directory and re-run the command.

3. If the kubelet and kubeadm versions do not match, reinstall matching versions:

yum remove kubelet
yum install -y kubelet-1.17.3 kubeadm-1.17.3 kubectl-1.17.3