Creating Consul Nodes and Building Clusters (Including a Single-Host Multi-Node Cluster)

2020-06-23  idriss

I. About Consul

1. Introduction to Consul

Consul is HashiCorp's tool for building distributed systems: a distributed, highly available, horizontally scalable solution for service discovery and configuration. It ships with built-in service registration and discovery, a distributed consensus protocol implementation, health checking, key/value storage, and multi-datacenter support. Consul is full-featured, simple to deploy, and easy to use.

Similar service discovery and registration frameworks include ZooKeeper, etcd, and Eureka.

2. Consul Architecture

(Figure: Consul architecture - consul-arch.png)

3. Consul Features

4. Common Consul Use Cases

Consul's application scenarios include service discovery, service segmentation, and service configuration. A minimal service-registration sketch follows.
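
As a quick illustration of the service-discovery scenario, the sketch below registers a hypothetical service named web with a locally running agent and then resolves it through Consul's DNS interface. The service name, port, and health-check URL are placeholder assumptions, not part of the original article (pass -http-addr if your agent does not listen on the default 127.0.0.1:8500):

cat > web.json <<'EOF'
{
  "service": {
    "name": "web",
    "port": 8080,
    "check": {
      "http": "http://127.0.0.1:8080/health",
      "interval": "10s"
    }
  }
}
EOF
consul services register web.json           # register the service with the local agent
dig @127.0.0.1 -p 8600 web.service.consul   # resolve the service through Consul DNS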

II. Running a Single Node in dev Mode

This mode is suitable for debugging in day-to-day development. If you have strong requirements for data persistence and service reliability, skip this mode.

1. Download and Install Consul

Download the package into the /data/pkgs directory with wget:

cd /data/pkgs
wget https://releases.hashicorp.com/consul/1.7.3/consul_1.7.3_linux_amd64.zip

Unzip the package into the /data/services directory, into a folder named consul:

unzip consul_1.7.3_linux_amd64.zip -d /data/services/consul
 Archive:  consul_1.7.3_linux_amd64.zip
 inflating: /data/services/consul/consul

Check the directory contents:

ls -l /data/services/consul
total 105444
-rwxr-xr-x 1 centos centos 107970750 May  6 06:50 consul

2. Configure Consul

The previous step completed the basic installation of Consul. For convenience, however, we still need to add the directory containing the consul binary to PATH so it can be invoked from anywhere, create a configuration directory so the agent can start without a long list of command-line flags, and send logs to a dedicated directory so Consul does not flood the system log.

a. Create the directories and set environment variables

Create the bin, conf, data, and log directories:

cd /data/services/consul
mkdir {bin,log,conf,data}

Move the consul binary into the bin directory:

mv consul bin/

Add Consul's bin directory to PATH (the following requires sudo or root privileges):

sudo vim /etc/profile.d/consul.sh     # To avoid breaking system commands through a misconfigured PATH, create a per-application script under /etc/profile.d named after the service; it is easy to maintain and can simply be removed when no longer needed, with minimal impact on the system
The script contents:
export CONSUL_HOME=/data/services/consul
export PATH=${PATH}:${CONSUL_HOME}/bin
# Apply the configuration by sourcing the script
source /etc/profile.d/consul.sh

Consul is now on the system PATH, and the consul command can be invoked from any directory, for example:

cd /data/services
consul help
Usage: consul [--version] [--help] <command> [<args>]
Available commands are:
 acl            Interact with Consul's ACLs
 agent          Runs a Consul agent
 catalog        Interact with the catalog
 config         Interact with Consul's Centralized Configurations
 connect        Interact with Consul Connect
 debug          Records a debugging archive for operators
 event          Fire a new event
 exec           Executes a command on Consul nodes
 force-leave    Forces a member of the cluster to enter the "left" state
 info           Provides debugging information for operators.
 intention      Interact with Connect service intentions
 join           Tell Consul agent to join cluster
 keygen         Generates a new encryption key
 keyring        Manages gossip layer encryption keys
 kv             Interact with the key-value store
 leave          Gracefully leaves the Consul cluster and shuts down
 lock           Execute a command holding a lock
 login          Login to Consul using an auth method
 logout         Destroy a Consul token created with login
 maint          Controls node or service maintenance mode
 members        Lists the members of a Consul cluster
 monitor        Stream logs from a Consul agent
 operator       Provides cluster-level tools for Consul operators
 reload         Triggers the agent to reload configuration files
 rtt            Estimates network round trip time between nodes
 services       Interact with services
 snapshot       Saves, restores and inspects snapshots of Consul server state
 tls            Builtin helpers for creating CAs and certificates
 validate       Validate config files/directories
 version        Prints the Consul version
 watch          Watch for changes in Consul

b. Create the configuration file

cd /data/services/consul/conf
vim dev.json
{
 "bind_addr": "10.100.0.2",
 "client_addr": "10.100.0.2",
 "datacenter": "dc1",
 "data_dir": "/data/services/consul/data",
 "log_level": "INFO",
 "log_file": "/data/services/consul/log/consul.log",   # log file and directory
 "log_rotate_duration": "24h",  # enable log rotation
 "enable_syslog": false,      # keep consul logs out of the system log
 "enable_debug": true,
 "node_name": "Consul",
 "ui": true
}

Note that JSON does not allow comments; the # annotations here (and in the later configuration listings) are explanations only and must be removed from the actual files.
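
Before starting the agent, the configuration directory can be sanity-checked with Consul's built-in validator (assuming the # annotations have been stripped):

consul validate /data/services/consul/conf
# prints "Configuration is valid!" when the directory parses cleanly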

3. Start Consul in dev Mode

consul agent -dev -config-dir=/data/services/consul/conf
==> Starting Consul agent...
 Version: 'v1.7.3'
 Node ID: '0e2d44c2-af33-e222-5eb5-58b2c1f903d5'
 Node name: 'Consul'
 Datacenter: 'dc1' (Segment: '<all>')
 Server: true (Bootstrap: false)
 Client Addr: [10.100.0.2] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
 Cluster Addr: 10.100.0.2 (LAN: 8301, WAN: 8302)
 Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:0e2d44c2-af33-e222-5eb5-58b2c1f903d5 Address:10.100.0.2:8300}]"
 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.raft: entering follower state: follower="Node at 10.100.0.2:8300 [Follower]" leader=
 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.serf.wan: serf: EventMemberJoin: Consul.dc1 10.100.0.2
 2020-06-18T15:11:48.436+0800 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: Consul 10.100.0.2
 2020-06-18T15:11:48.436+0800 [INFO]  agent.server: Adding LAN server: server="Consul (Addr: tcp/10.100.0.2:8300) (DC: dc1)"
 2020-06-18T15:11:48.436+0800 [INFO]  agent.server: Handled event for server in area: event=member-join server=Consul.dc1 area=wan
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started DNS server: address=10.100.0.2:8600 network=tcp
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started DNS server: address=10.100.0.2:8600 network=udp
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started HTTP server: address=10.100.0.2:8500 network=tcp
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started gRPC server: address=10.100.0.2:8502 network=tcp
 2020-06-18T15:11:48.437+0800 [INFO]  agent: started state syncer
==> Consul agent running!
 2020-06-18T15:11:48.489+0800 [WARN]  agent.server.raft: heartbeat timeout reached, starting election: last-leader=
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server.raft: entering candidate state: node="Node at 10.100.0.2:8300 [Candidate]" term=2
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server.raft: election won: tally=1
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server.raft: entering leader state: leader="Node at 10.100.0.2:8300 [Leader]"
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server: cluster leadership acquired
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server: New leader elected: payload=Consul
 2020-06-18T15:11:48.501+0800 [INFO]  agent.server.connect: initialized primary datacenter CA with provider: provider=consul
 2020-06-18T15:11:48.501+0800 [INFO]  agent.leader: started routine: routine="CA root pruning"
 2020-06-18T15:11:48.501+0800 [INFO]  agent.server: member joined, marking health alive: member=Consul
 2020-06-18T15:11:48.615+0800 [INFO]  agent: Synced node info

If you see output like the above, the consul service is running successfully in dev mode; the sketch below shows a quick functional check.
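
For a quick functional check, list the members and exercise the KV store. Since this agent's client interface listens on 10.100.0.2 rather than localhost, the HTTP address must be passed explicitly; the key foo and value bar are throwaway examples:

consul members -http-addr=http://10.100.0.2:8500        # the single node "Consul" should show as alive
consul kv put -http-addr=http://10.100.0.2:8500 foo bar # write a test key
consul kv get -http-addr=http://10.100.0.2:8500 foo     # should print "bar"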

4. Configure Graceful Start and Restart for Consul

In the previous step, Consul had to be started from the command line with arguments. To make management easier, add the consul service to systemd so that it can be started and stopped gracefully. For better security, first create a user with a non-login shell to run the consul service:

sudo useradd -M -s /sbin/nologin consul

Change the owner of the consul service directory:

sudo chown -R consul:consul /data/services/consul

Add a systemd unit (note that systemd does not pass ExecStart through a shell, so shell redirections such as >/dev/null 2>&1 must not appear there):

sudo vim /usr/lib/systemd/system/consul.service
[Unit]
Description=Consul
Documentation=https://www.consul.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
User=consul
Group=consul
Type=simple
ExecStart=/data/services/consul/bin/consul agent -dev -config-dir=/data/services/consul/conf

[Install]
WantedBy=multi-user.target

Reload the systemd configuration:

sudo systemctl daemon-reload

5. Gracefully Start, Stop, and Restart the Consul Service

Start the consul service with systemd:

sudo systemctl start consul

Check the status of the consul service with systemd:

sudo systemctl status consul
● consul.service - Consul
 Loaded: loaded (/usr/lib/systemd/system/consul.service; disabled; vendor preset: disabled)
 Active: active (running) since Thu 2020-06-18 15:41:48 CST; 18s ago
 Docs: https://www.consul.io/docs/
 Main PID: 2217 (consul)
 CGroup: /system.slice/consul.service
 └─2217 /data/services/consul/bin/consul agent -dev -config-dir=/data/services/consul/conf

Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.732+0800 [INFO]  agent.server: Handled event for server in area: event=member-join server=Consul.dc1 area=wan
Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.733+0800 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: Consul 10.100.0.2
Jun 18 15:41:52 localhost consul[2217]: 2020-06-18T15:41:52.582+0800 [INFO]  agent.server: Adding LAN server: server="Consul (Addr: tcp/10.100.0.2:8300) (DC: dc1)"
Hint: Some lines were ellipsized, use -l to show in full.

Stop the consul service with systemd:

sudo systemctl stop consul

Restart the consul service with systemd:

sudo systemctl restart consul

III. Running a Single Server Node

This mode suits test environments, and development environments that need Consul data persisted. If you want a full cluster instead, skip ahead to the next part.

1. Download and Install Consul

Download the package into the /data/pkgs directory with wget:

cd /data/pkgs
wget https://releases.hashicorp.com/consul/1.7.3/consul_1.7.3_linux_amd64.zip

Unzip the package into the /data/services directory, into a folder named consul:

unzip consul_1.7.3_linux_amd64.zip -d /data/services/consul
 Archive:  consul_1.7.3_linux_amd64.zip
 inflating: /data/services/consul/consul

Check the directory contents:

ls -l /data/services/consul
total 105444
-rwxr-xr-x 1 centos centos 107970750 May  6 06:50 consul

2. Configure Consul

The previous step completed the basic installation of Consul. For convenience, however, we still need to add the directory containing the consul binary to PATH so it can be invoked from anywhere, create a configuration directory so the agent can start without a long list of command-line flags, and send logs to a dedicated directory so Consul does not flood the system log.

a. Create the directories and set environment variables

Create the bin, conf, data, and log directories:

cd /data/services/consul
mkdir {bin,log,conf,data}

Move the consul binary into the bin directory:

mv consul bin/

Add Consul's bin directory to PATH (the following requires sudo or root privileges):

sudo vim /etc/profile.d/consul.sh     # To avoid breaking system commands through a misconfigured PATH, create a per-application script under /etc/profile.d named after the service; it is easy to maintain and can simply be removed when no longer needed, with minimal impact on the system
The script contents:
export CONSUL_HOME=/data/services/consul
export PATH=${PATH}:${CONSUL_HOME}/bin
# Apply the configuration by sourcing the script
source /etc/profile.d/consul.sh

Consul is now on the system PATH, and the consul command can be invoked from any directory:

cd /data/services
consul version
Consul v1.7.3
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

b. Create the configuration file

cd /data/services/consul/conf
vim server.json
{
 "bind_addr": "10.100.0.2",
 "client_addr": "10.100.0.2",
 "datacenter": "dc1",
 "data_dir": "/data/services/consul/data",
 "encrypt": "EXz7LFN8hpQ4id8EDYiFoQ==",
 "log_level": "INFO",
 "log_file": "/data/services/consul/log/consul.log",   # log file and directory
 "log_rotate_duration": "24h",  # enable log rotation
 "enable_syslog": false,      # keep consul logs out of the system log
 "enable_debug": true,
 "node_name": "Consul",
 "server": true,
 "ui": true,
 "bootstrap_expect": 1,  # set to 1 so a single vote is enough to become leader; a larger value causes errors complaining that the cluster has no leader
 "leave_on_terminate": false,
 "skip_leave_on_interrupt": true,
 "rejoin_after_leave": true,
 "retry_join": [
 "10.100.0.2:8301"
 ]
}
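
The encrypt value is a base64-encoded gossip encryption key. Rather than reusing the example key from this article, generate a fresh one and paste it into the "encrypt" field:

consul keygen   # prints a new base64-encoded key for gossip encryption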

3. Start Consul in Server Mode

The start command is the same as in dev mode, minus the -dev flag:

consul agent -config-dir=/data/services/consul/conf
==> Starting Consul agent...
 Version: 'v1.7.3'
 Node ID: '0e2d44c2-af33-e222-5eb5-58b2c1f903d5'
 Node name: 'Consul'
 Datacenter: 'dc1' (Segment: '<all>')
 Server: true (Bootstrap: false)
 Client Addr: [10.100.0.2] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
 Cluster Addr: 10.100.0.2 (LAN: 8301, WAN: 8302)
 Encrypt: Gossip: true, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:0e2d44c2-af33-e222-5eb5-58b2c1f903d5 Address:10.100.0.2:8300}]"
 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.raft: entering follower state: follower="Node at 10.100.0.2:8300 [Follower]" leader=
 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.serf.wan: serf: EventMemberJoin: Consul.dc1 10.100.0.2
 2020-06-18T15:11:48.436+0800 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: Consul 10.100.0.2
 2020-06-18T15:11:48.436+0800 [INFO]  agent.server: Adding LAN server: server="Consul (Addr: tcp/10.100.0.2:8300) (DC: dc1)"
 2020-06-18T15:11:48.436+0800 [INFO]  agent.server: Handled event for server in area: event=member-join server=Consul.dc1 area=wan
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started DNS server: address=10.100.0.2:8600 network=tcp
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started DNS server: address=10.100.0.2:8600 network=udp
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started HTTP server: address=10.100.0.2:8500 network=tcp
 2020-06-18T15:11:48.436+0800 [INFO]  agent: Started gRPC server: address=10.100.0.2:8502 network=tcp
 2020-06-18T15:11:48.437+0800 [INFO]  agent: started state syncer
==> Consul agent running!
 2020-06-18T15:11:48.489+0800 [WARN]  agent.server.raft: heartbeat timeout reached, starting election: last-leader=
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server.raft: entering candidate state: node="Node at 10.100.0.2:8300 [Candidate]" term=2
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server.raft: election won: tally=1
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server.raft: entering leader state: leader="Node at 10.100.0.2:8300 [Leader]"
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server: cluster leadership acquired
 2020-06-18T15:11:48.490+0800 [INFO]  agent.server: New leader elected: payload=Consul
 2020-06-18T15:11:48.501+0800 [INFO]  agent.server.connect: initialized primary datacenter CA with provider: provider=consul
 2020-06-18T15:11:48.501+0800 [INFO]  agent.leader: started routine: routine="CA root pruning"
 2020-06-18T15:11:48.501+0800 [INFO]  agent.server: member joined, marking health alive: member=Consul
 2020-06-18T15:11:48.615+0800 [INFO]  agent: Synced node info

4. Configure Graceful Start and Restart for Consul

In the previous step, Consul had to be started from the command line with arguments. To make management easier, add the consul service to systemd so that it can be started and stopped gracefully. For better security, first create a user with a non-login shell to run the consul service:

sudo useradd -M -s /sbin/nologin consul

Change the owner of the consul service directory:

sudo chown -R consul:consul /data/services/consul

Add a systemd unit. This is almost identical to the dev-mode unit; only the -dev flag is removed from the start command (and, again, no shell redirections in ExecStart):

sudo vim /usr/lib/systemd/system/consul.service
[Unit]
Description=Consul
Documentation=https://www.consul.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
User=consul
Group=consul
Type=simple
ExecStart=/data/services/consul/bin/consul agent -config-dir=/data/services/consul/conf

[Install]
WantedBy=multi-user.target

Reload the systemd configuration:

sudo systemctl daemon-reload

5. Gracefully Start, Stop, and Restart the Consul Service

Start the consul service with systemd:

sudo systemctl start consul

Check the status of the consul service with systemd:

sudo systemctl status consul
● consul.service - Consul
 Loaded: loaded (/usr/lib/systemd/system/consul.service; disabled; vendor preset: disabled)
 Active: active (running) since Thu 2020-06-18 15:41:48 CST; 18s ago
 Docs: https://www.consul.io/docs/
 Main PID: 2217 (consul)
 CGroup: /system.slice/consul.service
 └─2217 /data/services/consul/bin/consul agent -config-dir=/data/services/consul/conf

Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.732+0800 [INFO]  agent.server: Handled event for server in area: event=member-join server=Consul.dc1 area=wan
Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.733+0800 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: Consul 10.100.0.2
Jun 18 15:41:52 localhost consul[2217]: 2020-06-18T15:41:52.582+0800 [INFO]  agent.server: Adding LAN server: server="Consul (Addr: tcp/10.100.0.2:8300) (DC: dc1)"
Hint: Some lines were ellipsized, use -l to show in full.

Stop the consul service with systemd:

sudo systemctl stop consul

Restart the consul service with systemd:

sudo systemctl restart consul

IV. Building a 3-Node Cluster

This mode suits production environments with high reliability requirements. If you are not targeting production, or are just learning and experimenting, skip this part. It is also the most resource-intensive setup described in this document.

1. Planning and Preparation

Host plan:

Host role        Host IP
Consul-server1   10.100.0.2
Consul-server2   10.100.0.3
Consul-server3   10.100.0.4
Consul-agent     10.100.0.5

All of these hosts must be able to reach one another; configure your security groups or firewalls accordingly, for example as sketched below.
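
As a sketch for CentOS hosts using firewalld (cloud security-group rules would replace this; the port list reflects Consul's defaults), the required ports could be opened like so:

# 8300 server RPC, 8301 serf LAN, 8302 serf WAN, 8500 HTTP API/UI, 8600 DNS
for p in 8300/tcp 8301/tcp 8301/udp 8302/tcp 8302/udp 8500/tcp 8600/tcp 8600/udp; do
  sudo firewall-cmd --permanent --add-port=$p
done
sudo firewall-cmd --reload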

2. Download and Install Consul

Download the package into the /data/pkgs directory with wget:

cd /data/pkgs
wget https://releases.hashicorp.com/consul/1.7.3/consul_1.7.3_linux_amd64.zip

Unzip the package into the /data/services directory, into a folder named consul:

unzip consul_1.7.3_linux_amd64.zip -d /data/services/consul
 Archive:  consul_1.7.3_linux_amd64.zip
 inflating: /data/services/consul/consul

Check the directory contents:

ls -l /data/services/consul
total 105444
-rwxr-xr-x 1 centos centos 107970750 May  6 06:50 consul

Note:

Perform the steps above on every machine; each host needs Consul installed.

3. Configure Consul

The previous step completed the basic installation of Consul. For convenience, however, we still need to add the directory containing the consul binary to PATH so it can be invoked from anywhere, create a configuration directory so the agent can start without a long list of command-line flags, and send logs to a dedicated directory so Consul does not flood the system log.

a. Create the directories and set environment variables

Create the bin, conf, data, and log directories:

cd /data/services/consul
mkdir {bin,log,conf,data}

Move the consul binary into the bin directory:

mv consul bin/

Add Consul's bin directory to PATH (the following requires sudo or root privileges):

sudo vim /etc/profile.d/consul.sh     # To avoid breaking system commands through a misconfigured PATH, create a per-application script under /etc/profile.d named after the service; it is easy to maintain and can simply be removed when no longer needed, with minimal impact on the system
The script contents:
export CONSUL_HOME=/data/services/consul
export PATH=${PATH}:${CONSUL_HOME}/bin
# Apply the configuration by sourcing the script
source /etc/profile.d/consul.sh

Consul is now on the system PATH, and the consul command can be invoked from any directory:

cd /data/services
consul version
Consul v1.7.3
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

b. Create the Consul server configuration file

cd /data/services/consul/conf
vim server.json
{
 "bind_addr": "10.100.0.2",     # the IP of the server this node runs on
 "client_addr": "10.100.0.2",  # this server's IP, or 127.0.0.1; with 127.0.0.1 the cluster cannot be reached from outside through this server's client interfaces
 "datacenter": "dc1",
 "data_dir": "/data/services/consul/data",
 "encrypt": "EXz7LFN8hpQ4id8EDYiFoQ==",   # this encryption key must be identical on every node, otherwise nodes cannot communicate
 "log_level": "INFO",
 "log_file": "/data/services/consul/log/consul.log",   # log file and directory
 "log_rotate_duration": "24h",  # enable log rotation
 "enable_syslog": false,      # keep consul logs out of the system log
 "enable_debug": true,
 "node_name": "Consul",  # must be unique per node, e.g. ConsulServer1/2/3
 "server": true,
 "ui": true,
 "bootstrap_expect": 3,
 "leave_on_terminate": false,
 "skip_leave_on_interrupt": true,
 "rejoin_after_leave": true,
 "retry_join": [
 "10.100.0.2",
 "10.100.0.3",
 "10.100.0.4"
 ]
}

Note:

The configuration above applies only to the server nodes; the agent configuration differs. On each server, bind_addr, client_addr, and node_name must be set to that node's own values, as in the sketch below.
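
For example, assuming an otherwise identical layout, the file on Consul-server2 (10.100.0.3) would differ only in these fields:

 "bind_addr": "10.100.0.3",
 "client_addr": "10.100.0.3",
 "node_name": "ConsulServer2",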

c. Create the Consul agent configuration file

cd /data/services/consul/conf
vim agent.json
{
 "bind_addr": "10.100.0.5",   # the address the agent binds to; 127.0.0.1 also works
 "client_addr": "10.100.0.5",  # the node's NIC address, so it is reachable from outside; this IP becomes the unified entry point into the cluster
 "datacenter": "dc1",
 "data_dir": "/data/services/consul/data",
 "encrypt": "EXz7LFN8hpQ4id8EDYiFoQ==",   # must match the key on the server side, otherwise communication fails
 "log_level": "INFO",
 "log_file": "/data/services/consul/log/consul.log",
 "log_rotate_duration": "24h",
 "enable_syslog": false,
 "enable_debug": true,
 "node_name": "ConsulClient",
 "ui": true,
 "server": false,
 "rejoin_after_leave": true,
 "retry_join": [
 "10.100.0.2",
 "10.100.0.3",
 "10.100.0.4"
 ]
}

4. Start the Server and Agent Nodes

Start each node with the following command:

consul agent -config-dir=/data/services/consul/conf
==> Starting Consul agent...
 Version: 'v1.7.3'
 Node ID: '0e2d44c2-af33-e222-5eb5-58b2c1f903d5'
 Node name: 'Consul'
 Datacenter: 'dc1' (Segment: '<all>')
 Server: true (Bootstrap: false)
 Client Addr: [10.100.0.2] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
 Cluster Addr: 10.100.0.2 (LAN: 8301, WAN: 8302)
 Encrypt: Gossip: true, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:0e2d44c2-af33-e222-5eb5-58b2c1f903d5 Address:10.100.0.2:8300}]"

# remaining output trimmed for brevity; only the beginning is shown

Note:

Run the command above on every node. Once all three servers are up they elect a leader; the check below confirms the cluster formed.
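
Membership can then be checked from any host through the agent's client address (10.100.0.5 in this plan):

consul members -http-addr=http://10.100.0.5:8500
# expect four alive entries: three of type "server" and one of type "client"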

5. Configure Graceful Start and Restart for Consul

In the previous step, Consul had to be started from the command line with arguments. To make management easier, add the consul service to systemd so that it can be started and stopped gracefully. For better security, first create a user with a non-login shell to run the consul service:

sudo useradd -M -s /sbin/nologin consul

Change the owner of the consul service directory:

sudo chown -R consul:consul /data/services/consul

Add a systemd unit. This is almost identical to the dev-mode unit; only the -dev flag is removed from the start command (and, again, no shell redirections in ExecStart):

sudo vim /usr/lib/systemd/system/consul.service
[Unit]
Description=Consul
Documentation=https://www.consul.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
User=consul
Group=consul
Type=simple
ExecStart=/data/services/consul/bin/consul agent -config-dir=/data/services/consul/conf

[Install]
WantedBy=multi-user.target

Reload the systemd configuration:

sudo systemctl daemon-reload

6. Gracefully Start, Stop, and Restart the Consul Service

Start the consul service with systemd:

sudo systemctl start consul

Check the status of the consul service with systemd:

sudo systemctl status consul
● consul.service - Consul
 Loaded: loaded (/usr/lib/systemd/system/consul.service; disabled; vendor preset: disabled)
 Active: active (running) since Thu 2020-06-18 15:41:48 CST; 18s ago
 Docs: https://www.consul.io/docs/
 Main PID: 2217 (consul)
 CGroup: /system.slice/consul.service
 └─2217 /data/services/consul/bin/consul agent -config-dir=/data/services/consul/conf

Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.732+0800 [INFO]  agent.server: Handled event for server in area: event=member-join server=Consul.dc1 area=wan
Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.733+0800 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: Consul 10.100.0.2
Jun 18 15:41:52 localhost consul[2217]: 2020-06-18T15:41:52.582+0800 [INFO]  agent.server: Adding LAN server: server="Consul (Addr: tcp/10.100.0.2:8300) (DC: dc1)"
Hint: Some lines were ellipsized, use -l to show in full.

Stop the consul service with systemd:

sudo systemctl stop consul

Restart the consul service with systemd:

sudo systemctl restart consul

V. Building a 3-Node Cluster on a Single Host

The previous part built a 3-node cluster, but it needs a fair number of servers, which is not cost-friendly. I needed a 3-node cluster while keeping costs down, and since I could not find a tutorial for a single-host 3-node setup online, I worked one out from the configuration reference in the official documentation. The steps follow.

1. Quick Plan

Node role       Node host IP  Client HTTP port  DNS port        serf_lan port  serf_wan port  server port
Consul-Server1  10.100.0.2    8501              8601            8001           8002           8000
Consul-Server2  10.100.0.2    8502              8602            8101           8102           8100
Consul-Server3  10.100.0.2    8503              8603            8201           8202           8200
Consul-Agent    10.100.0.2    8500 (default)    8600 (default)  -              -              -

2. Download and Install Consul

Download the package into the /data/pkgs directory with wget:

cd /data/pkgs
wget https://releases.hashicorp.com/consul/1.7.3/consul_1.7.3_linux_amd64.zip

Unzip the package into the /data/services directory, into a folder named consul:

unzip consul_1.7.3_linux_amd64.zip -d /data/services/consul
 Archive:  consul_1.7.3_linux_amd64.zip
 inflating: /data/services/consul/consul

Check the directory contents:

ls -l /data/services/consul
total 105444
-rwxr-xr-x 1 centos centos 107970750 May  6 06:50 consul

3. Configure Consul

a. Create the per-node directories

Create the directories for each node:

cd /data/services/consul
mkdir -p node{1..3}/{bin,conf,data,log}
mkdir -p agent/{bin,conf,data,log}

After creation, the directory tree looks roughly like this:

tree
.
├── agent
│   ├── bin
│   ├── conf
│   ├── data
│   │   └── serf
│   └── log
├── node1
│   ├── bin
│   ├── conf
│   ├── data
│   │   ├── raft
│   │   │   └── snapshots
│   │   └── serf 
│   └── log
├── node2
│   ├── bin
│   ├── conf
│   ├── data
│   │   ├── raft
│   │   │   └── snapshots
│   │   └── serf 
│   └── log
└── node3
    ├── bin
    ├── conf
    ├── data
    │   ├── raft
    │   │   └── snapshots
    │   └── serf
    └── log

Copy the binary into each node's bin directory:

cd /data/services/consul
cp consul node1/bin/
cp consul node2/bin/
cp consul node3/bin/
cp consul agent/bin/

b. Create the Consul server configuration files

Taking the Server1 node as an example:

cd /data/services/consul/node1/conf
vim server.json
{
 "bind_addr": "10.100.0.2",
 "client_addr": "127.0.0.1",
 "ports": {
 "http": 8501,  # the other server nodes use their planned ports instead
 "dns": 8601,  # the other server nodes use their planned ports instead
 "serf_lan": 8001,  # the other server nodes use their planned ports instead
 "serf_wan": 8002,  # the other server nodes use their planned ports instead
 "server": 8000  # the other server nodes use their planned ports instead
 },
 "datacenter": "dc1",
 "data_dir": "/data/services/consul/node1/data",  # mind the directory name: use the matching node's directory, e.g. /data/services/consul/node2/data
 "encrypt": "EXz7LFN8hpQ4id8EDYiFoQ==",  # must be identical across nodes
 "log_level": "INFO",
 "log_file": "/data/services/consul/node1/log/consul.log",  # mind the directory name: it differs per node
 "log_rotate_duration": "24h",
 "enable_syslog": false,
 "enable_debug": true,
 "node_name": "ConsulServer1",  # use the planned name for each node
 "disable_host_node_id": true,  # do not derive the node ID from host information
 "server": true,
 "ui": true,
 "bootstrap_expect": 3,
 "leave_on_terminate": false,
 "skip_leave_on_interrupt": true,
 "rejoin_after_leave": true,
 "retry_join": [
 "10.100.0.2:8001",
 "10.100.0.2:8101",
 "10.100.0.2:8201"
 ]
}
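
Writing three nearly identical files by hand is error-prone. As a convenience sketch (assuming node1/conf/server.json exists and the ports follow the plan above), the node2 and node3 variants could be generated with sed:

cd /data/services/consul
for i in 2 3; do
  sed -e "s/node1/node$i/g" \
      -e "s/ConsulServer1/ConsulServer$i/" \
      -e "s/\"http\": 8501/\"http\": $((8500+i))/" \
      -e "s/\"dns\": 8601/\"dns\": $((8600+i))/" \
      -e "s/\"serf_lan\": 8001/\"serf_lan\": $(((i-1)*100+8001))/" \
      -e "s/\"serf_wan\": 8002/\"serf_wan\": $(((i-1)*100+8002))/" \
      -e "s/\"server\": 8000/\"server\": $(((i-1)*100+8000))/" \
      node1/conf/server.json > node$i/conf/server.json
done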

c. Create the Consul agent configuration file

cd /data/services/consul/agent/conf
vim agent.json
{
 "bind_addr": "0.0.0.0",
 "client_addr": "0.0.0.0",
 "datacenter": "dc1",
 "data_dir": "/data/services/consul/agent/data",
 "encrypt": "EXz7LFN8hpQ4id8EDYiFoQ==",
 "log_level": "INFO",
 "log_file": "/data/services/consul/agent/log/consul.log",
 "log_rotate_duration": "24h",
 "enable_syslog": false,
 "enable_debug": true,
 "node_name": "ConsulClient",
 "ui": true,
 "disable_host_node_id": true,   #禁用主机信息生成的节点ID
 "server": false,
 "rejoin_after_leave": true,
 "retry_join": [
 "10.100.0.2:8001", 
 "10.100.0.2:8101", 
 "10.100.0.2:8201"
 ]
}

4. Start the Consul Nodes

Taking Server1 as an example, start it with:

cd /data/services/consul/node1/bin  # mind the directory: switch into the matching node's directory before starting it
./consul agent -config-dir=/data/services/consul/node1/conf
==> Starting Consul agent...
 Version: 'v1.7.3'
 Node ID: '0e2d44c2-af33-e222-5eb5-58b2c1f903d5'
 Node name: 'Consul'
 Datacenter: 'dc1' (Segment: '<all>')
 Server: true (Bootstrap: false)
 Client Addr: [10.100.0.2] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
 Cluster Addr: 10.100.0.2 (LAN: 8301, WAN: 8302)
 Encrypt: Gossip: true, TLS-Outgoing: false, TLS-Incoming: false, Auto-Encrypt-TLS: false

==> Log data will now stream in as it occurs:

 2020-06-18T15:11:48.435+0800 [INFO]  agent.server.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:0e2d44c2-af33-e222-5eb5-58b2c1f903d5 Address:10.100.0.2:8300}]"

# remaining output trimmed for brevity; note that the banner above was captured from the earlier single-node run, and with node1's configuration the ports and node name follow the plan (HTTP 8501, node ConsulServer1, and so on)

Note:

Run this for every node, switching into the corresponding node's directory each time. Once all three servers are running, check for a leader as sketched below.
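
Since every node listens on its own planned HTTP port, membership can be checked against any of them, for example Server1's port 8501:

cd /data/services/consul/node1/bin
./consul members -http-addr=http://127.0.0.1:8501
# all three servers (plus the agent, once started) should be listed as alive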

5. Configure Graceful Start and Restart for Consul

In the previous step, Consul had to be started from the command line with arguments. To make management easier, add the consul services to systemd so that they can be started and stopped gracefully. For better security, first create a user with a non-login shell to run them:

sudo useradd -M -s /sbin/nologin consul

Change the owner of the consul service directory:

sudo chown -R consul:consul /data/services/consul

Taking Server1 as an example, add a systemd unit. This mirrors the earlier units: no -dev flag and no shell redirection in ExecStart (and strip the # annotations from the real file):

sudo vim /usr/lib/systemd/system/consul-node1.service  # mind the file name: Server1 maps to consul-node1.service, Server2 to consul-node2.service, and so on; for the agent, use consul-agent.service
[Unit]
Description=Consul-node1  # service description
Documentation=https://www.consul.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
User=consul
Group=consul
Type=simple
ExecStart=/data/services/consul/node1/bin/consul agent -config-dir=/data/services/consul/node1/conf  # mind the paths

[Install]
WantedBy=multi-user.target

Reload the systemd configuration:

sudo systemctl daemon-reload

Note:

The unit file name must match the node it manages: in this document, Server1 maps to consul-node1.service, Server2 to consul-node2.service, Server3 to consul-node3.service, and the agent to consul-agent.service. To start all of them at boot, see the sketch below.
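
To bring all four units up and have them start automatically at boot, they can be enabled in one pass (assuming the unit files above are in place):

for unit in consul-node1 consul-node2 consul-node3 consul-agent; do
  sudo systemctl enable --now $unit   # enable at boot and start immediately
done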

6. Gracefully Start, Stop, and Restart the Consul Services

Taking Server1 as an example, start the consul service with systemd:

sudo systemctl start consul-node1  # to start Server2, use sudo systemctl start consul-node2

Check the status of the consul service with systemd:

sudo systemctl status consul-node1  # to check Server2, use sudo systemctl status consul-node2
● consul-node1.service - Consul-node1
 Loaded: loaded (/usr/lib/systemd/system/consul-node1.service; disabled; vendor preset: disabled)
 Active: active (running) since Thu 2020-06-18 15:41:48 CST; 18s ago
 Docs: https://www.consul.io/docs/
 Main PID: 2217 (consul)
 CGroup: /system.slice/consul-node1.service
 └─2217 /data/services/consul/node1/bin/consul agent -config-dir=/data/services/consul/node1/conf

Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.732+0800 [INFO]  agent.server: Handled event for server in area: event=member-join server=Consul.dc1 area=wan
Jun 18 15:41:50 localhost consul[2217]: 2020-06-18T15:41:50.733+0800 [INFO]  agent.server.serf.lan: serf: EventMemberJoin: Consul 10.100.0.2
Jun 18 15:41:52 localhost consul[2217]: 2020-06-18T15:41:52.582+0800 [INFO]  agent.server: Adding LAN server: server="Consul (Addr: tcp/10.100.0.2:8300) (DC: dc1)"
Hint: Some lines were ellipsized, use -l to show in full.

Stop the consul service with systemd:

sudo systemctl stop consul-node1  # to stop Server2, use sudo systemctl stop consul-node2

Restart the consul service with systemd:

sudo systemctl restart consul-node1  # to restart Server2, use sudo systemctl restart consul-node2

VI. Pitfalls and Lessons Learned

1. A single server node cannot elect a leader

To fix this, add the following parameter to the configuration file (the examples in this document already include it) and set its value to 1, so that a single vote is enough to elect a leader (in general, Raft needs a quorum of floor(N/2)+1 of the expected servers):

"bootstrap_expect": 1

2. A single-host 3-node cluster starts, but the logs report that no leader was elected and clients get HTTP 500

The cause is that all of the cluster's nodes were using the same node_id (the logs showed every node announcing itself with the same node_id), while bootstrap_expect was set to 3, so there were never enough distinct votes to elect a leader. There are two ways to fix this.

a. Option 1:

Edit the node-id file under each node's data directory. Taking Server1 as an example:

cd /data/services/consul/node1/data

The tree command shows the following structure:

tree
.
├── checkpoint-signature
├── node-id
├── raft
│   ├── peers.info
│   ├── raft.db
│   └── snapshots
│       ├── 7-131089-1592693864841
│       │   ├── meta.json
│       │   └── state.bin
│       └── 7-147478-1592808873974
│           ├── meta.json
│           └── state.bin
└── serf 

Edit the node-id file:

vim node-id
6905298b-fd50-6423-2c42-1ddaf123e120

Note:

Give each Server node a unique id; it must not duplicate any other Server node's id.

b. Option 2:

Address the root cause: all nodes ended up with the same node_id because, by default, Consul derives the node ID from the host's hardware information with a deterministic algorithm. Since all three server nodes run on the same host, they all derived the same node_id.

The fix is to add the following parameter to the configuration file and set its value to true:

"disable_host_node_id": true

The single-host 3-node configurations in this document already include this setting.

3. In the web UI served by the agent, every node is marked as leader

Cause: the web UI marks the leader by IP. Because all three nodes run on the same host with a single NIC, every node listens on the same IP, so the web UI tags each of them as the leader. This does not affect operation.

Troubleshooting and confirming the issue:

Run the CLI on the host's agent node and inspect the cluster from there:

cd /data/services/consul/agent/bin
./consul operator raft list-peers
Node           ID                                    Address          State     Voter  RaftProtocol
ConsulServer1  6905298b-fd50-6423-2c42-1ddaf123e120  10.100.0.2:8000  follower  true   3
ConsulServer3  e927bbfa-e067-a84f-93ea-6712cf1db7f8  10.100.0.2:8200  follower  true   3
ConsulServer2  38e5b263-b848-dfb6-d197-115ca2da40e7  10.100.0.2:8100  leader    true   3

The CLI shows that, of the three nodes in the cluster, exactly one is the leader. The same check can be scripted against the HTTP API, as sketched below.
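
For scripted checks, the leader can also be read from the HTTP status endpoints on the agent (listening on the default port 8500 in this setup); given the output above, the first call should print ConsulServer2's address:

curl http://127.0.0.1:8500/v1/status/leader   # e.g. "10.100.0.2:8100"
curl http://127.0.0.1:8500/v1/status/peers    # addresses of all raft peers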

A final note:

Given the limits of my knowledge and skill, this document may contain inaccurate or awkward wording here and there; please bear with me.
