Notes on Installing and Testing Kubernetes in a Production Environment
2020-05-04
JohnYuCN
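Test environment:
Alibaba Cloud, one master and two slaves
- master:
  hostname: ali005
  OS: Ubuntu 18.04
  spec: 2 vCPU, 4 GB RAM, 40 GB disk
- slave:
  hostname: ali001
  OS: Ubuntu 18.04
  spec: 1 vCPU, 2 GB RAM, 40 GB disk
- slave:
  hostname: ali007
  OS: Ubuntu 18.04
  spec: 1 vCPU, 2 GB RAM, 40 GB disk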
I. Configure the master node:
0. Install Docker
Note: do not install Docker via snap.
1. Add the Alibaba Cloud Kubernetes apt source
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat << EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
2. Install the three tools
kubeadm:
the command to bootstrap the cluster.
kubelet:
the component that runs on all of the machines in your cluster and does things like starting pods and containers.
kubectl:
the command line util to talk to your cluster.
apt-get update
apt-get install -y kubelet kubeadm kubectl
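The install command above takes whatever version is newest in the mirror. A minimal sketch of pinning all three packages to one version and holding them there, assuming the mirror follows the upstream `<version>-00` package revision scheme. The helper name `k8s_pkg_specs` is hypothetical, and 1.18.2 is chosen only because it matches the kubelet version shown in the `kubectl get nodes` output later in this article:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build "package=version-00" specs for the three tools.
# Assumption: the apt mirror uses the upstream "-00" package revision suffix.
k8s_pkg_specs() {
  local version="$1" pkg
  for pkg in kubelet kubeadm kubectl; do
    printf '%s=%s-00\n' "$pkg" "$version"
  done
}

# Dry run: print the commands instead of executing them.
# Remove the leading "echo" to actually install and hold the packages.
echo apt-get install -y $(k8s_pkg_specs 1.18.2)
echo apt-mark hold kubelet kubeadm kubectl
```

Holding the packages keeps a routine `apt-get upgrade` from moving kubelet to a version that differs from the control plane.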
3. Export the default kubeadm configuration file:
kubeadm config print init-defaults > kubeadm.yml
4. Edit kubeadm.yml at the places marked with comments
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # IP of the master node
  advertiseAddress: 172.26.195.120
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: ali005
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
# Changed to the Alibaba Cloud image mirror
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.18.0
networking:
  dnsDomain: cluster.local
  # Pod network subnet (with Calico set it as below; with flannel use 10.244.0.0/16)
  podSubnet: "192.168.0.0/16"
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Note: the same settings can instead be passed directly on the command line (not yet tried):
Reference:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
root@ali005:~# kubeadm init --apiserver-advertise-address 172.26.195.120 \
--pod-network-cidr=192.168.0.0/16 \
--token-ttl 24h0m0s \
--image-repository registry.aliyuncs.com/google_containers
5. List and pull the images:
# List the required images
kubeadm config images list --config kubeadm.yml
# Pull the images
kubeadm config images pull --config kubeadm.yml
At this point all required images have been downloaded.
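6. Initialize and configure the Kubernetes master node
kubeadm init --config=kubeadm.yml --upload-certs | tee kubeadm-init.log
The end of the command output, which is also saved in kubeadm-init.log, shows the join command that the slave machines use as their credential for joining the cluster: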
kubeadm join 172.26.195.120:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b27f3d8aeb3509dd19f8281797e0118e1dd241046e3914c712dc3a34936609f4
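Since the init output was saved with `tee`, the join command can be recovered from the log later instead of being copied by hand. A minimal sketch, assuming the log contains the `kubeadm join` line in the form shown above; the function name `extract_join_cmd` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first "kubeadm join ..." command found in a
# saved kubeadm init log, including its backslash-continued lines.
extract_join_cmd() {
  awk '
    /^[[:space:]]*kubeadm join / { printing = 1 }
    printing {
      sub(/^[[:space:]]+/, "")   # strip leading indentation
      print
      if ($0 !~ /\\$/) exit      # stop at the first line without a trailing backslash
    }
  ' "$1"
}

# Usage on the master (assumes the log file from step 6 exists):
#   extract_join_cmd kubeadm-init.log
```

The token in kubeadm.yml has a ttl of 24h0m0s, so a join attempted more than a day later will need a new one; running `kubeadm token create --print-join-command` on the master prints a fresh join command.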
7. Configure kubectl
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
8. Verify:
kubectl get node
root@ali005:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
ali005 NotReady master 123m v1.18.2
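II. Configure the slave nodes:
1. Complete steps 0, 1 and 2 of Part I
2. Join the cluster (make sure the machines can reach each other over the internal network):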
kubeadm join 172.26.195.120:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b27f3d8aeb3509dd19f8281797e0118e1dd241046e3914c712dc3a34936609f4
3. Test from the master
root@ali005:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
ali005 NotReady master 151m v1.18.2
ali007 NotReady <none> 23s v1.18.2
root@ali005:~# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-7ff77c879f-ngh8q 0/1 Pending 0 152m <none> <none> <none> <none>
coredns-7ff77c879f-x8cql 0/1 Pending 0 152m <none> <none> <none> <none>
etcd-ali005 1/1 Running 0 152m 172.26.195.120 ali005 <none> <none>
kube-apiserver-ali005 1/1 Running 0 152m 172.26.195.120 ali005 <none> <none>
kube-controller-manager-ali005 1/1 Running 0 152m 172.26.195.120 ali005 <none> <none>
kube-proxy-plgwp 1/1 Running 0 152m 172.26.195.120 ali005 <none> <none>
kube-proxy-vq5nb 1/1 Running 0 95s 172.26.195.121 ali007 <none> <none>
kube-scheduler-ali005 1/1 Running 0 152m 172.26.195.120 ali005 <none> <none>
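In the output above the nodes are NotReady and the coredns Pods are Pending, which is the expected state until a Pod network add-on is installed. A small sketch for pulling the not-ready node names out of the `kubectl get nodes` table; the function name `not_ready_nodes` is hypothetical, and it assumes the default column layout shown above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS column is anything other than exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage on the master:
#   kubectl get nodes | not_ready_nodes
```

An empty result means every node reports Ready.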
III. Install the network:
1. On the master, install Calico (see the Calico official website):
root@ali005:~# kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
2. Check whether the installation succeeded (in this setup it took roughly 2-3 hours before the output below appeared)
root@ali005:~# watch kubectl get pods --all-namespaces
Every 2.0s: kubectl get pods --all-namespaces ali005: Tue May 5 06:57:12 2020
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6fcbbfb6fb-99ngk 1/1 Running 0 8h
kube-system calico-node-crlh7 1/1 Running 0 8h
kube-system calico-node-p8l75 1/1 Running 0 5m1s
kube-system calico-node-xbtpg 1/1 Running 0 7h11m
kube-system coredns-7ff77c879f-ngh8q 1/1 Running 0 10h
kube-system coredns-7ff77c879f-x8cql 1/1 Running 0 10h
kube-system etcd-ali005 1/1 Running 0 10h
kube-system kube-apiserver-ali005 1/1 Running 0 10h
kube-system kube-controller-manager-ali005 1/1 Running 0 10h
kube-system kube-proxy-67572 1/1 Running 0 7h11m
kube-system kube-proxy-plgwp 1/1 Running 0 10h
kube-system kube-proxy-t5m7p 1/1 Running 0 5m1s
kube-system kube-scheduler-ali005 1/1 Running 3 10h
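3. Note: if, while watching Pod status with watch kubectl get pods --all-namespaces, a Pod is stuck in ImagePullBackOff and never reaches Running, try the following steps:
On the master, delete the node: kubectl delete nodes <NAME>
On the slave, reset the configuration: kubeadm reset
On the slave, reboot the machine: reboot
On the slave, rejoin the cluster: kubeadm join (with the same arguments as in Part II, step 2)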
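Before resetting a node it helps to know exactly which Pods are stuck. A minimal sketch that filters the Pod table for anything not yet running; the function name `pods_not_running` is hypothetical, and it assumes the six-column layout of `kubectl get pods --all-namespaces` shown above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on stdin
# and print "namespace/name status" for every Pod whose STATUS column is
# neither Running nor Completed.
pods_not_running() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Usage on the master:
#   kubectl get pods --all-namespaces | pods_not_running
# Then inspect the events of a stuck Pod, for example:
#   kubectl describe pod <NAME> -n kube-system
```

The events at the end of the `kubectl describe pod` output usually name the image that failed to pull, which shows whether the node can reach the image registry at all.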
IV. Launch an Nginx cluster
1. Check the status of the control-plane components:
root@ali005:~# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
2. Check the status of the master node:
root@ali005:~# kubectl cluster-info
Kubernetes master is running at https://172.26.195.120:6443
KubeDNS is running at https://172.26.195.120:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
3. Check each node:
root@ali005:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
ali001 Ready <none> 8h v1.18.2
ali005 Ready master 18h v1.18.2
ali007 Ready <none> 15h v1.18.2
4. Create a configuration file for the nginx Deployment:
vim nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  # Run 2 nginx Pod replicas
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
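Notes on the above:
- The Deployment is named nginx-deployment; the Service created in step 6 takes the same name
- Two replicas are created
- The container is named nginx
5. Start it and check (the output below was captured after the Service from step 6 already existed):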
root@ali005:~# kubectl apply -f nginx-deployment.yaml
root@ali005:~# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 19h
nginx-deployment LoadBalancer 10.96.26.13 <pending> 80:31160/TCP 8h
root@ali005:~# kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 2/2 2 2 8h
root@ali005:~# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 8h
nginx-deployment-6b474476c4-28npc 1/1 Running 0 8h
nginx-deployment-6b474476c4-rn54n 1/1 Running 0 8h
6. Expose the service
root@ali005:~# kubectl expose deployment nginx-deployment --port=80 --target-port=80 --type=LoadBalancer
or, equivalently (--target-port defaults to the value of --port):
root@ali005:~# kubectl expose deployment nginx-deployment --port=80 --type=LoadBalancer
root@ali005:~# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 11h
nginx-deployment LoadBalancer 10.96.26.13 <pending> 80:31160/TCP 57m
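Here 31160 is the NodePort, the port on which the service is exposed outside the cluster on every node. EXTERNAL-IP stays <pending>, presumably because this cluster has no cloud load balancer integration to provision one.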
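The NodePort can be read out of the PORT(S) column with plain shell string handling and then used to test the service from any machine that can reach a node. A minimal sketch; the function name `node_port` is hypothetical, and `<node-ip>` is a placeholder for one of the node addresses:

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the NodePort from a PORT(S) value such as
# "80:31160/TCP" as printed by `kubectl get services`.
node_port() {
  local ports="$1"
  ports="${ports#*:}"            # drop the service port and colon -> "31160/TCP"
  printf '%s\n' "${ports%%/*}"   # drop the protocol suffix        -> "31160"
}

# Usage (replace <node-ip> with the address of any node):
#   curl -I "http://<node-ip>:$(node_port 80:31160/TCP)/"
# kubectl can also print the value directly:
#   kubectl get service nginx-deployment -o jsonpath='{.spec.ports[0].nodePort}'
```

On Alibaba Cloud the port must also be allowed by the instance's security group before it is reachable from outside the internal network.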
7. Inspect the service details:
root@ali005:~# kubectl describe service nginx-deployment
Name: nginx-deployment
Namespace: default
Labels: app=nginx
Annotations: <none>
Selector: app=nginx
Type: LoadBalancer
IP: 10.96.26.13
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 31160/TCP
Endpoints: 192.168.127.66:80,192.168.127.67:80
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
9. Update the image version used by the container
root@ali005:~# kubectl set image deployment/nginx-deployment nginx=nginx:1.10
deployment.apps/nginx-deployment image updated
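`kubectl set image` returns as soon as the Deployment spec is changed, before the new Pods are up. A minimal sketch of a wrapper that waits for the rollout and rolls back if it does not finish; the function name `update_image`, the `KUBECTL` override variable and the 120s timeout are assumptions made for this illustration:

```shell
#!/usr/bin/env bash
# Assumption: KUBECTL can be overridden (for example KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical wrapper: change a container image, wait for the rollout to
# complete, and undo the rollout if it fails or times out.
update_image() {
  local deployment="$1" container="$2" image="$3"
  "$KUBECTL" set image "deployment/$deployment" "$container=$image" || return 1
  if ! "$KUBECTL" rollout status "deployment/$deployment" --timeout=120s; then
    "$KUBECTL" rollout undo "deployment/$deployment"
    return 1
  fi
}

# Usage on the master:
#   update_image nginx-deployment nginx nginx:1.10
```

With `KUBECTL=echo` the function only prints the two kubectl invocations, which is a quick way to check the arguments before touching the cluster.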
10. Scale the number of Pods up (or down)
root@ali005:~# kubectl scale deployment/nginx-deployment --replicas=3
deployment.apps/nginx-deployment scaled
11. Check the Pods:
root@ali005:~# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-cdbc5548b-gvrrf 1/1 Running 0 9m41s
nginx-deployment-cdbc5548b-m2ft7 1/1 Running 0 21m
nginx-deployment-cdbc5548b-tjdxs 1/1 Running 0 21m
root@ali005:~# kubectl describe pod nginx-deployment-cdbc5548b-gvrrf
Name: nginx-deployment-cdbc5548b-gvrrf
Namespace: default
Priority: 0
Node: ali001/172.26.138.7
Start Time: Sun, 10 May 2020 12:49:54 +0800
Labels: app=nginx
pod-template-hash=cdbc5548b
Annotations: cni.projectcalico.org/podIP: 192.168.127.77/32
cni.projectcalico.org/podIPs: 192.168.127.77/32
Status: Running
IP: 192.168.127.77
12. Delete the Deployment and the Service:
root@ali005:~# kubectl delete deployment nginx-deployment
deployment.apps "nginx-deployment" deleted
root@ali005:~# kubectl delete service nginx-deployment
service "nginx-deployment" deleted