
16. Redis Cluster

2021-03-08  随便写写咯

10. Redis Cluster

10.1 How Redis Cluster Works

The Sentinel mechanism solves the Redis high-availability problem: when the master fails, a slave is automatically promoted to master, keeping the Redis service available. But it does nothing for the single-node write bottleneck: a single Redis instance's write performance is capped by that machine's memory size, concurrency limits, NIC bandwidth, and similar factors.

In short: replication removes the single point of failure, Sentinel adds automatic failover, and Cluster adds multiple masters for higher performance, with replication built in, so it also provides high availability.

Early Redis distributed deployment schemes:

Client-side partitioning: the client program decides which Redis node each key is written to (for example, by taking a hash of the key modulo the node count and matching it against predefined placement rules). The client must handle write distribution, high availability, and failover itself.
Proxy-based: a third-party Redis proxy sits in front of the servers; clients connect to the proxy layer, which routes each key. This keeps clients simple, but adding or removing cluster nodes is cumbersome, and the proxy itself is a single point of failure and a performance bottleneck.

Client-side partitioning (diagram omitted)

Proxy-based (diagram omitted)

Hence, starting with Redis 3.0, Redis Cluster was introduced as a decentralized architecture: every node stores its own data plus the state of the whole cluster, and every node connects to all other nodes.

Redis Cluster has the following characteristics:

All Redis nodes are interconnected and exchange state via a PING/PONG (gossip) mechanism
A node's data is only considered failed when more than half of the nodes in the cluster detect the failure
Clients connect to Redis directly without a proxy; the application needs the IPs of all Redis servers
Redis Cluster maps all masters evenly onto slots 0-16383; reads and writes must go to the node owning the key's slot, so N masters multiply Redis's capacity roughly N times, with each master holding about 16384/N slots
Redis Cluster pre-allocates 16384 slots; when a key-value pair is written, CRC16(key) mod 16384 determines the slot and therefore the node the key is written to, which removes the single-machine bottleneck
Each master/replica group stores a distinct share of the data, giving a sharded, horizontally scalable deployment
Within a group, every piece of data is written to the master and then replicated to its replica(s) as a backup
If any master fails, its replica is automatically promoted; no extra configuration is needed
Slots are assigned to master nodes only
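The slot computation just described can be reproduced offline. Below is a minimal sketch of the CRC16 variant Redis Cluster uses for key hashing (CRC16-CCITT/XMODEM, polynomial 0x1021, initial value 0); real Redis additionally honors hash tags ({...} substrings), which this sketch omits. Its results match the MOVED replies seen later in this article:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), the checksum Redis Cluster uses for key hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def keyslot(key: str) -> int:
    """Hash slot for a key: CRC16(key) mod 16384 (hash tags not handled)."""
    return crc16(key.encode()) % 16384

print(keyslot("key"))   # 12539 -- the slot reported by the MOVED reply later on
print(keyslot("key2"))  # 4998
```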

10.2 Redis Cluster Architecture


10.2.1 Basic Redis Cluster Architecture

Suppose three master nodes A, B, and C share the 16384 hash slots.

Their slot ranges would be:

节点A覆盖 0-5460
节点B覆盖 5461-10922
节点C覆盖 10923-16383
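A cluster client resolves a slot to its owning node with a simple range lookup. A minimal sketch using the three ranges above (the node names A/B/C are the placeholders from this example):

```python
# Slot ranges of the three example masters
SLOT_RANGES = {
    "A": range(0, 5461),       # 0-5460
    "B": range(5461, 10923),   # 5461-10922
    "C": range(10923, 16384),  # 10923-16383
}

def node_for_slot(slot: int) -> str:
    """Return the node that owns a given hash slot."""
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise ValueError(f"slot {slot} is outside 0-16383")

print(node_for_slot(12539))  # C
```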

10.2.2 Master/Replica Architecture in Redis Cluster

Although Redis Cluster's architecture solves the scaling problem, it raises a new one: how is each master itself kept highly available? The answer is replication per master. The cluster monitors every master automatically; if one fails, its replica takes over as the new master. This is built into Redis Cluster and requires no extra configuration.

10.2.3 Choosing a Deployment Architecture

Environment A: lower cost, suitable for testing
Three servers, each running two Redis instances (e.g. ports 6379 and 6380).
Put the master and replica of each group on different servers to avoid a single point of failure.
10.0.0.81:6379/6380
10.0.0.82:6379/6380
10.0.0.83:6379/6380
# Reserve one extra server for testing cluster node addition
Environment B: requires hardware investment, suitable for production
Six servers forming three master/replica groups.
Cluster nodes:
10.0.0.81
10.0.0.82
10.0.0.83
10.0.0.84
10.0.0.85
10.0.0.86
Reserved servers for future expansion:
10.0.0.87
10.0.0.88

Redis 5.x changed a great deal compared with earlier versions; the configuration below targets version 5.x.

10.2.4 Deployment Methods

Manual setup: useful for understanding the cluster architecture, but tedious and rarely used in production.
The redis-cli --cluster tooling: efficient and accurate; the usual choice in production (used below).
GUI tools: automated, visual deployment.

Lab: building a cluster with Redis 5

Lab environment:

Six Redis servers forming three master/replica groups; the pairings are assigned automatically by the cluster.

All Redis nodes must hold no data before the cluster is created.

10.0.0.81
10.0.0.82
10.0.0.83
10.0.0.84
10.0.0.85
10.0.0.86

1. Install Redis on all nodes and back up the configuration file

yum -y install redis
cp  -a /etc/redis.conf /etc/redis.conf.bak

2. Edit the configuration file

bind 0.0.0.0
masterauth redis #required; without it, master/replica failover will not work later. Also set:
requirepass redis
cluster-enabled yes #uncomment this line; cluster mode must be enabled, after which the redis process shows [cluster]
cluster-config-file nodes-6379.conf #uncomment this line; the cluster state file records the master/replica relationships and slot ranges, and is created and maintained automatically by redis cluster
cluster-require-full-coverage no #default is yes; set to no so that one whole master/replica group going down does not take the entire cluster down. Each group owns a fixed set of slots and data, and losing part of the slots should not make the whole cluster unavailable

Apply the changes with sed; run this on every Redis node:

[20:17:37 root@81 ~]#sed -i.bak -e 's/^bind 127.0.0.1/bind 0.0.0.0/' -e '/masterauth/a masterauth redis' -e '/# requirepass/a requirepass redis' -e '/# cluster-enabled yes/a cluster-enabled yes' -e '/cluster-config-file nodes-6379.conf/a cluster-config-file nodes-6379.conf' -e '/cluster-require-full-coverage yes/c cluster-require-full-coverage no' /etc/redis.conf 
[20:19:11 root@81 ~]#diff /etc/redis.conf /etc/redis.conf.bak 
69c69
< bind 0.0.0.0
---
> bind 127.0.0.1
294d293
< masterauth redis
509d507
< requirepass redis
841d838
< cluster-enabled yes
850d846
< cluster-config-file nodes-6379.conf
933c929
< cluster-require-full-coverage no
---
> # cluster-require-full-coverage yes
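The sed expressions can be rehearsed offline against a small sample config before touching /etc/redis.conf. The sample lines below are an assumption modeled on the stock CentOS 8 redis.conf defaults:

```shell
# Build a minimal sample containing just the lines the sed script targets
cat > /tmp/redis.conf.sample <<'EOF'
bind 127.0.0.1
# masterauth <master-password>
# requirepass foobared
# cluster-enabled yes
# cluster-config-file nodes-6379.conf
# cluster-require-full-coverage yes
EOF

# Same expressions as above, applied to the sample
sed -i.bak -e 's/^bind 127.0.0.1/bind 0.0.0.0/' \
    -e '/masterauth/a masterauth redis' \
    -e '/# requirepass/a requirepass redis' \
    -e '/# cluster-enabled yes/a cluster-enabled yes' \
    -e '/cluster-config-file nodes-6379.conf/a cluster-config-file nodes-6379.conf' \
    -e '/cluster-require-full-coverage yes/c cluster-require-full-coverage no' \
    /tmp/redis.conf.sample

# Show only the active (uncommented) settings that resulted
grep -v '^#' /tmp/redis.conf.sample
```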

3. Start the Redis service on all nodes

systemctl enable --now redis

Verify the listening ports and the running process:

[20:25:25 root@81 ~]#ss -ntlp
State            Recv-Q           Send-Q                     Local Address:Port                      Peer Address:Port                                                              
LISTEN           0                128                              0.0.0.0:6379                           0.0.0.0:*               users:(("redis-server",pid=22806,fd=6))           
LISTEN           0                128                              0.0.0.0:22                             0.0.0.0:*               users:(("sshd",pid=823,fd=3))                     
LISTEN           0                128                              0.0.0.0:16379                          0.0.0.0:*               users:(("redis-server",pid=22806,fd=8))           
LISTEN           0                128                                 [::]:22                                [::]:*               users:(("sshd",pid=823,fd=4))  
redis      22806  0.1  0.5  53520 10060 ?        Ssl  20:24   0:00 /usr/bin/redis-server 0.0.0.0:6379 [cluster]

Connect to Redis and observe the change:

# Cluster    #with cluster mode enabled, a Cluster section appears in the INFO output
cluster_enabled:1

At this point writes fail: no slots have been assigned yet, so Redis does not know which node the data belongs on.

[20:28:06 root@81 ~]#redis-cli -a redis set course linux
(error) CLUSTERDOWN Hash slot not served

4. Create the cluster

Run the cluster-creation command on any one cluster node.

The command creates the cluster and assigns master/replica roles automatically: nodes listed first generally become masters, those listed later become replicas, and the slots are distributed among the masters.

[20:34:17 root@81 ~]#redis-cli -a redis --cluster create 10.0.0.81:6379 10.0.0.82:6379 10.0.0.83:6379 10.0.0.84:6379 10.0.0.85:6379 10.0.0.86:6379 --cluster-replicas 1  #1 means one replica per master
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460 # slot range assigned to each master
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 10.0.0.84:6379 to 10.0.0.81:6379 # master/replica pairing for each group
Adding replica 10.0.0.85:6379 to 10.0.0.82:6379
Adding replica 10.0.0.86:6379 to 10.0.0.83:6379
M: 0599da2e785b53626bb1292b371845d943563c33 10.0.0.81:6379  # master node info and slot assignment
   slots:[0-5460] (5461 slots) master
M: fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 10.0.0.82:6379
   slots:[5461-10922] (5462 slots) master
M: 4ee223748925c7a081746bbf7088eb6c8ae3ba2e 10.0.0.83:6379
   slots:[10923-16383] (5461 slots) master
S: d72390a3603a459a5a7c0cf652a187b73a785802 10.0.0.84:6379 # replica node info and the master it replicates
   replicates 0599da2e785b53626bb1292b371845d943563c33 
S: ec0e9c4d56abbc604718ab4f611782e533f1eb8c 10.0.0.85:6379
   replicates fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b
S: 584d6494b4fcb5cdee237a61a3b80cb2fd20ffd5 10.0.0.86:6379
   replicates 4ee223748925c7a081746bbf7088eb6c8ae3ba2e
Can I set the above configuration? (type 'yes' to accept): yes  # type yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
......
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 0599da2e785b53626bb1292b371845d943563c33 10.0.0.81:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: d72390a3603a459a5a7c0cf652a187b73a785802 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 0599da2e785b53626bb1292b371845d943563c33
S: ec0e9c4d56abbc604718ab4f611782e533f1eb8c 10.0.0.85:6379
   slots: (0 slots) slave
   replicates fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b
S: 584d6494b4fcb5cdee237a61a3b80cb2fd20ffd5 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 4ee223748925c7a081746bbf7088eb6c8ae3ba2e
M: 4ee223748925c7a081746bbf7088eb6c8ae3ba2e 10.0.0.83:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
M: fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 10.0.0.82:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration. # all nodes agree on the slot configuration
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered. # all 16384 slots assigned

Verify the cluster configuration:

[20:37:37 root@81 ~]#redis-cli -a redis
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6379> cluster info
cluster_state:ok # cluster state
cluster_slots_assigned:16384 # total number of slots assigned
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6 # total number of nodes in the cluster
cluster_size:3 # number of master/replica groups
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:161
cluster_stats_messages_pong_sent:155
cluster_stats_messages_sent:316
cluster_stats_messages_ping_received:150
cluster_stats_messages_pong_received:161
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:316

Inspect the nodes-6379.conf file:

[20:40:11 root@81 ~]#cat /var/lib/redis/nodes-6379.conf # the content is identical on every node, though the line order may differ
d72390a3603a459a5a7c0cf652a187b73a785802 10.0.0.84:6379@16379 slave 0599da2e785b53626bb1292b371845d943563c33 0 1615184835000 4 connected
ec0e9c4d56abbc604718ab4f611782e533f1eb8c 10.0.0.85:6379@16379 slave fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 0 1615184838000 5 connected
584d6494b4fcb5cdee237a61a3b80cb2fd20ffd5 10.0.0.86:6379@16379 slave 4ee223748925c7a081746bbf7088eb6c8ae3ba2e 0 1615184837000 6 connected
4ee223748925c7a081746bbf7088eb6c8ae3ba2e 10.0.0.83:6379@16379 master - 0 1615184836000 3 connected 10923-16383
fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 10.0.0.82:6379@16379 master - 0 1615184838000 2 connected 5461-10922
0599da2e785b53626bb1292b371845d943563c33 10.0.0.81:6379@16379 myself,master - 0 1615184837000 1 connected 0-5460  # myself marks the node this file was read on; the master that follows is its role
vars currentEpoch 6 lastVoteEpoch 0
The cluster is now fully created: master/replica roles are set and slots are assigned.

Master                    Replica
10.0.0.81                 10.0.0.84
10.0.0.82                 10.0.0.85
10.0.0.83                 10.0.0.86

5. Connect to Redis and write data

Now that the cluster is up, every write first hashes the key (the value plays no part): Redis computes the slot from the key, and the slot determines the node. If a client connected to node A writes a key whose slot lives on node B, A replies with a MOVED message telling the client which node owns the slot, and the client must resend the command to B. Cluster-aware clients follow this redirection automatically.

Note: Redis only tells the client where the key belongs; it does not forward the data to the correct node itself. A plain client has to reconnect to the right node and write there.

127.0.0.1:6379> set key value
(error) MOVED 12539 10.0.0.83:6379  #Redis computed that key hashes to slot 12539, which lives on 10.0.0.83, so it tells the client to send the command there. The client must then reconnect to 10.0.0.83 and set the key again; the (error) MOVED reply itself does not write anything to 10.0.0.83

Likewise, if you query node A for a key whose slot belongs to node B, Redis returns the owning node and slot to the client. The redirection is based purely on the slot the key hashes to; the target node does not necessarily hold the key.

For example, looking up name on node A: A computes that name's slot belongs to B and tells the client to ask B, but name may not exist on B at all. A only knows where the key would live, not whether it actually exists.
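A cluster-aware client automates exactly this redirection: it parses the MOVED reply and resends the command to the node it names. The helper below is an illustrative sketch (real clients such as redis-py-cluster additionally cache the slot-to-node mapping so later commands go to the right node directly):

```python
def parse_moved(error_message: str) -> tuple:
    """Parse a 'MOVED <slot> <host>:<port>' redirection into (slot, host, port)."""
    parts = error_message.split()
    if not parts or parts[0] != "MOVED":
        raise ValueError("not a MOVED redirection")
    slot = int(parts[1])
    host, _, port = parts[2].rpartition(":")
    return slot, host, int(port)

# The redirection shown above, as a client library would see it
print(parse_moved("MOVED 12539 10.0.0.83:6379"))  # (12539, '10.0.0.83', 6379)
```

After parsing, the client opens a connection to host:port and replays the original command; redis-cli only does this automatically when started with -c.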

6. Operate on the cluster with -c

-c: enable cluster mode (follow -ASK and -MOVED redirections)

With -c, redis-cli follows the redirections itself and completes the read or write, instead of merely reporting which node owns the slot.

[21:56:34 root@83 ~]#redis-cli -a redis -h 10.0.0.82 set key2 haha

(error) MOVED 4998 10.0.0.81:6379

[21:56:35 root@83 ~]#redis-cli -a redis -h 10.0.0.81 get key2  # without -c, the earlier set was never actually performed
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
(nil)

[21:56:40 root@83 ~]#redis-cli -c -a redis -h 10.0.0.82 set key2 haha
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
OK
[21:56:55 root@83 ~]#redis-cli -a redis -h 10.0.0.81 get key2 
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
"haha"

Note: keys * is not cluster-aware: it lists only the keys on the local Redis node, not the keys of the whole cluster.

7. Import data into the cluster with a Python script

yum -y install python3
pip3 install --upgrade pip
pip3 install redis-py-cluster
vim redis_cluster_test.py
#! /usr/bin/env python3
from rediscluster import RedisCluster
startup_nodes = [
            {"host":"10.0.0.81","port":6379},
            {"host":"10.0.0.82","port":6379},
            {"host":"10.0.0.83","port":6379},
            {"host":"10.0.0.84","port":6379},
            {"host":"10.0.0.85","port":6379},
            {"host":"10.0.0.86","port":6379}
]
redis_conn = RedisCluster(startup_nodes = startup_nodes, password = 'redis', decode_responses = True)

for i in range(0,10000):
    redis_conn.set('key'+str(i), 'value'+str(i))
    print('key'+str(i)+':', redis_conn.get('key'+str(i)))   
chmod +x redis_cluster_test.py 
./redis_cluster_test.py
...
key9996: value9996
key9997: value9997
key9998: value9998
key9999: value9999
These 10000 keys are distributed across the 3 masters according to their slots and replicated to the corresponding replicas:
Master: 10.0.0.83
# Keyspace
db0:keys=3329,expires=0,avg_ttl=0

Replica: 10.0.0.86
# Keyspace
db0:keys=3329,expires=0,avg_ttl=0
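That the keys split almost evenly follows from the uniformity of CRC16 and can be verified offline. This sketch re-implements Redis's key hashing (CRC16-XMODEM mod 16384, hash tags omitted) and counts how many of key0..key9999 fall into each master's slot range:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), as used by Redis Cluster for key hashing."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

# Count how many of the 10000 test keys hash into each master's slot range
counts = {"0-5460": 0, "5461-10922": 0, "10923-16383": 0}
for i in range(10000):
    slot = crc16(f"key{i}".encode()) % 16384
    if slot <= 5460:
        counts["0-5460"] += 1
    elif slot <= 10922:
        counts["5461-10922"] += 1
    else:
        counts["10923-16383"] += 1

print(counts)  # roughly a third of the keys per range
```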

8. Test failover

Currently 10.0.0.81 is a master and 10.0.0.84 is its replica, forming one group.

Now stop the Redis service on 10.0.0.81 and watch whether 10.0.0.84 is promoted to master.

10.0.0.81

systemctl stop redis

Watch the Redis log on 10.0.0.84: it gets promoted to the new master.

The full log on 10.0.0.84 after Redis was stopped on 10.0.0.81 follows.
The replica retries the connection repeatedly; after the default 15-second timeout, the remaining masters inform it that its master is down.
1777:S 08 Mar 2021 15:51:36.237 # Connection with master lost.
1777:S 08 Mar 2021 15:51:36.237 * Caching the disconnected master state.
1777:S 08 Mar 2021 15:51:36.788 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:36.788 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:36.790 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:37.807 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:37.807 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:37.807 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:38.828 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:38.828 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:38.828 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:39.857 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:39.858 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:39.860 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:40.887 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:40.888 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:40.889 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:41.911 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:41.912 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:41.912 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:42.941 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:42.943 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:42.945 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:43.967 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:43.969 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:43.970 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:45.004 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:45.004 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:45.005 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:46.022 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:46.024 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:46.024 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:47.045 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:47.046 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:47.046 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:48.069 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:48.072 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:48.073 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:49.102 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:49.103 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:49.103 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:50.121 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:50.122 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:50.122 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:51.149 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:51.149 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:51.149 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:52.164 * Connecting to MASTER 10.0.0.81:6379
1777:S 08 Mar 2021 15:51:52.164 * MASTER <-> REPLICA sync started
1777:S 08 Mar 2021 15:51:52.165 # Error condition on socket for SYNC: Connection refused
1777:S 08 Mar 2021 15:51:52.422 * FAIL message received from fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b (node ID of 10.0.0.82) about 0599da2e785b53626bb1292b371845d943563c33 (node ID of 10.0.0.81) # the master's failure is detected and announced by the other masters
1777:S 08 Mar 2021 15:51:52.469 # Start of election delayed for 505 milliseconds (rank #0, offset 142833). # the election begins
1777:S 08 Mar 2021 15:51:52.977 # Starting a failover election for epoch 7.
1777:S 08 Mar 2021 15:51:53.015 # Failover election won: I'm the new master. # this replica wins the election and becomes the new master
1777:S 08 Mar 2021 15:51:53.015 # configEpoch set to 7 after successful failover
1777:M 08 Mar 2021 15:51:53.015 # Setting secondary replication ID to eb61dac029ac9a57d5f2396f91cac13529764b92, valid up to offset: 142834. New replication ID is 71a7f74e8666f718ed8f0503d63989ab61edc59d
1777:M 08 Mar 2021 15:51:53.015 * Discarding previously cached master state. # the new master discards its cached old-master state

Check the log on 10.0.0.82:

[15:46:05 root@CentOS-8-2 ~]#tail /var/log/redis/redis.log
2157:C 08 Mar 2021 15:37:08.163 * DB saved on disk
2157:C 08 Mar 2021 15:37:08.164 * RDB: 4 MB of memory used by copy-on-write
2102:M 08 Mar 2021 15:37:08.236 * Background saving terminated with success
2102:M 08 Mar 2021 15:42:09.046 * 10 changes in 300 seconds. Saving...
2102:M 08 Mar 2021 15:42:09.047 * Background saving started by pid 2159
2159:C 08 Mar 2021 15:42:09.050 * DB saved on disk
2159:C 08 Mar 2021 15:42:09.050 * RDB: 2 MB of memory used by copy-on-write
2102:M 08 Mar 2021 15:42:09.152 * Background saving terminated with success
2102:M 08 Mar 2021 15:51:53.310 * Marking node 0599da2e785b53626bb1292b371845d943563c33 as failing (quorum reached). # the surviving masters mark the failed master as failing and then grant the failover vote to promote a new master
2102:M 08 Mar 2021 15:51:53.874 # Failover auth granted to d72390a3603a459a5a7c0cf652a187b73a785802 for epoch 7

Check the log on 10.0.0.83:

[15:46:05 root@CentOS-8-3 ~]#tail /var/log/redis/redis.log
2163:C 08 Mar 2021 15:37:08.103 * DB saved on disk
2163:C 08 Mar 2021 15:37:08.103 * RDB: 4 MB of memory used by copy-on-write
2102:M 08 Mar 2021 15:37:08.196 * Background saving terminated with success
2102:M 08 Mar 2021 15:42:09.054 * 10 changes in 300 seconds. Saving...
2102:M 08 Mar 2021 15:42:09.055 * Background saving started by pid 2165
2165:C 08 Mar 2021 15:42:09.065 * DB saved on disk
2165:C 08 Mar 2021 15:42:09.065 * RDB: 2 MB of memory used by copy-on-write
2102:M 08 Mar 2021 15:42:09.157 * Background saving terminated with success
2102:M 08 Mar 2021 15:51:53.301 * Marking node 0599da2e785b53626bb1292b371845d943563c33 as failing (quorum reached).
2102:M 08 Mar 2021 15:51:53.861 # Failover auth granted to d72390a3603a459a5a7c0cf652a187b73a785802 for epoch 7

Check the log on 10.0.0.85:

[15:46:04 root@CentOS-8-5 ~]#tail /var/log/redis/redis.log 
1785:S 08 Mar 2021 15:37:07.449 * Background saving started by pid 1841
1841:C 08 Mar 2021 15:37:07.502 * DB saved on disk
1841:C 08 Mar 2021 15:37:07.506 * RDB: 4 MB of memory used by copy-on-write
1785:S 08 Mar 2021 15:37:07.567 * Background saving terminated with success
1785:S 08 Mar 2021 15:42:08.012 * 10 changes in 300 seconds. Saving...
1785:S 08 Mar 2021 15:42:08.013 * Background saving started by pid 1843
1843:C 08 Mar 2021 15:42:08.019 * DB saved on disk
1843:C 08 Mar 2021 15:42:08.019 * RDB: 2 MB of memory used by copy-on-write
1785:S 08 Mar 2021 15:42:08.117 * Background saving terminated with success
1785:S 08 Mar 2021 15:51:52.505 * FAIL message received from fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b about 0599da2e785b53626bb1292b371845d943563c33 # replicas also receive the message that a master is down

Inspect the nodes files.

Because Redis on 10.0.0.81 is stopped, its nodes file is not updated:

[15:51:36 root@CentOS-8-1 ~]#cat /var/lib/redis/nodes-6379.conf 
d72390a3603a459a5a7c0cf652a187b73a785802 10.0.0.84:6379@16379 slave 0599da2e785b53626bb1292b371845d943563c33 0 1615184835000 4 connected
ec0e9c4d56abbc604718ab4f611782e533f1eb8c 10.0.0.85:6379@16379 slave fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 0 1615184838000 5 connected
584d6494b4fcb5cdee237a61a3b80cb2fd20ffd5 10.0.0.86:6379@16379 slave 4ee223748925c7a081746bbf7088eb6c8ae3ba2e 0 1615184837000 6 connected
4ee223748925c7a081746bbf7088eb6c8ae3ba2e 10.0.0.83:6379@16379 master - 0 1615184836000 3 connected 10923-16383
fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 10.0.0.82:6379@16379 master - 0 1615184838000 2 connected 5461-10922
0599da2e785b53626bb1292b371845d943563c33 10.0.0.81:6379@16379 myself,master - 0 1615184837000 1 connected 0-5460
vars currentEpoch 6 lastVoteEpoch 0
On all other nodes, the nodes file shows 10.0.0.81 as failed:

[15:58:44 root@CentOS-8-5 ~]#cat /var/lib/redis/nodes-6379.conf 
ec0e9c4d56abbc604718ab4f611782e533f1eb8c 10.0.0.85:6379@16379 myself,slave fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 0 1615189908000 5 connected
0599da2e785b53626bb1292b371845d943563c33 10.0.0.81:6379@16379 master,fail - 1615189896412 1615189893000 1 disconnected
4ee223748925c7a081746bbf7088eb6c8ae3ba2e 10.0.0.83:6379@16379 master - 0 1615189911388 3 connected 10923-16383
d72390a3603a459a5a7c0cf652a187b73a785802 10.0.0.84:6379@16379 master - 0 1615189912408 7 connected 0-5460
fc9198a6ce8e06dd78f97a766bb067a0d2d43d7b 10.0.0.82:6379@16379 master - 0 1615189912000 2 connected 5461-10922
584d6494b4fcb5cdee237a61a3b80cb2fd20ffd5 10.0.0.86:6379@16379 slave 4ee223748925c7a081746bbf7088eb6c8ae3ba2e 0 1615189911000 6 connected
vars currentEpoch 7 lastVoteEpoch 0
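The nodes file lines above follow a fixed layout: node-id, address@cluster-bus-port, flags, master-id, ping-sent, pong-received, config-epoch, link-state, then any slot ranges. A small illustrative parser (a sketch, not an official tool):

```python
def parse_node_line(line: str) -> dict:
    """Parse one line of nodes-6379.conf into its named fields."""
    fields = line.split()
    return {
        "id": fields[0],
        "addr": fields[1].split("@")[0],  # strip the cluster-bus port
        "flags": fields[2].split(","),
        "master_id": fields[3],           # '-' for a master
        "link_state": fields[7],
        "slots": fields[8:],              # empty for replicas and slotless masters
    }

# The failed master's line from the example above
line = ("0599da2e785b53626bb1292b371845d943563c33 10.0.0.81:6379@16379 "
        "master,fail - 1615189896412 1615189893000 1 disconnected")
print(parse_node_line(line)["flags"])  # ['master', 'fail']
```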

Restore the Redis service on 10.0.0.81:

10.0.0.81

systemctl start redis

Watching the Redis log on 10.0.0.81, it rejoins the cluster automatically, with 10.0.0.84 as its master:

2674:M 08 Mar 2021 15:59:42.789 * Ready to accept connections
2674:M 08 Mar 2021 15:59:42.796 # Configuration change detected. Reconfiguring myself as a replica of d72390a3603a459a5a7c0cf652a187b73a785802
2674:S 08 Mar 2021 15:59:42.796 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
2674:S 08 Mar 2021 15:59:42.796 # Cluster state changed: ok
2674:S 08 Mar 2021 15:59:43.813 * Connecting to MASTER 10.0.0.84:6379 # 10.0.0.84 is now its master
2674:S 08 Mar 2021 15:59:43.813 * MASTER <-> REPLICA sync started
2674:S 08 Mar 2021 15:59:43.813 * Non blocking connect for SYNC fired the event.
2674:S 08 Mar 2021 15:59:43.814 * Master replied to PING, replication can continue...
2674:S 08 Mar 2021 15:59:43.815 * Trying a partial resynchronization (request e45c9c57b27adf0c536f056286e3ed099569b2d2:1).
2674:S 08 Mar 2021 15:59:43.817 * Full resync from master: 71a7f74e8666f718ed8f0503d63989ab61edc59d:142833 # after rejoining, a full resync occurs: the failed node's recorded master_replid is its own old one, so it does not match what it sends to the new master 10.0.0.84
2674:S 08 Mar 2021 15:59:43.817 * Discarding previously cached master state.
2674:S 08 Mar 2021 15:59:43.863 * MASTER <-> REPLICA sync: receiving 62735 bytes from master
2674:S 08 Mar 2021 15:59:43.877 * MASTER <-> REPLICA sync: Flushing old data
2674:S 08 Mar 2021 15:59:43.882 * MASTER <-> REPLICA sync: Loading DB in memory
2674:S 08 Mar 2021 15:59:43.888 * MASTER <-> REPLICA sync: Finished with success

Watching the Redis log on 10.0.0.84: after recovering, 10.0.0.81 becomes 84's replica:

1777:M 08 Mar 2021 15:59:42.729 * Clear FAIL state for node 0599da2e785b53626bb1292b371845d943563c33: master without slots is reachable again. # the FAIL state is cleared
1777:M 08 Mar 2021 15:59:43.702 * Replica 10.0.0.81:6379 asks for synchronization
1777:M 08 Mar 2021 15:59:43.702 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'e45c9c57b27adf0c536f056286e3ed099569b2d2', my replication IDs are '71a7f74e8666f718ed8f0503d63989ab61edc59d' and 'eb61dac029ac9a57d5f2396f91cac13529764b92')
1777:M 08 Mar 2021 15:59:43.702 * Starting BGSAVE for SYNC with target: disk
1777:M 08 Mar 2021 15:59:43.704 * Background saving started by pid 1861
1861:C 08 Mar 2021 15:59:43.708 * DB saved on disk
1861:C 08 Mar 2021 15:59:43.709 * RDB: 2 MB of memory used by copy-on-write
1777:M 08 Mar 2021 15:59:43.749 * Background saving terminated with success
1777:M 08 Mar 2021 15:59:43.762 * Synchronization with replica 10.0.0.81:6379 succeeded

Stop Redis on 10.0.0.84 once more so it becomes a replica of 10.0.0.81, restoring the original roles:

10.0.0.84

systemctl stop redis
# When switching masters manually, do not restart (or stop and immediately start) Redis; wait for the timeout to pass and for the cluster to finish promoting the new master before starting the failed node again
systemctl start redis

Note:

In cluster mode, all reads and writes happen on the masters; a master replicates its writes to its replicas. Without read/write splitting, replicas serve only as redundant backups.

In cluster mode, both reads and writes attempted on a replica are redirected to the master. You can list a replica's keys with keys *, but reading them is redirected to its master. (A replica will serve reads on the connection only after the client issues the READONLY command.)

Running keys * on 10.0.0.86 shows the keys stored on this replica:
127.0.0.1:6379> keys * 
...
3327) "key6300"
3328) "key3461"
3329) "key8906"

But reading directly on the replica is redirected to its master:
127.0.0.1:6379> get key8906
(error) MOVED 12488 10.0.0.83:6379

With the -c cluster mode option, reads and writes complete automatically:
redis-cli -a redis -c get key8906
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
"value8906"

A write attempted on a replica is likewise redirected to the master owning the slot:
[15:49:37 root@CentOS-8-6 ~]#redis-cli -a redis set haha lala
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
(error) MOVED 3662 10.0.0.81:6379

In plain master/replica or master/replica + Sentinel setups, reads can go directly to the replicas.

9. Cluster-related configuration parameters

cluster-enabled yes #enable cluster mode; the default is standalone mode

cluster-config-file nodes-6379.conf #name of the cluster state file generated automatically by the node

cluster-node-timeout 15000 #node timeout in milliseconds; a node unreachable longer than this is considered failed. Default 15000 (15s). If a replica cannot reach its master within 15s, the remaining masters inform it that its master is down, after which the replica can be promoted

cluster-replica-validity-factor 10 #during failover, replicas that have been disconnected from their master too long hold data that is too stale and are excluded from election. The cutoff is (node-timeout * replica-validity-factor) + repl-ping-replica-period; with the defaults that is (15000 ms * 10) + 10 s = 160 s

cluster-migration-barrier 1 #migration barrier: each master should keep at least this many working replicas; if a master's replica fails, a spare replica from another master is migrated over to become its new replica

cluster-require-full-coverage yes #require full slot coverage: if a master dies with no replica, some slots become unserved. With yes, the cluster stops serving entirely when coverage is incomplete; with no, it keeps serving, but queries for keys in the missing slots fail (that data is unreachable). no is recommended

cluster-replica-no-failover no #if yes, the replica never attempts automatic failover of its master, although a manual forced failover is still possible. Usually no

10. What happens when an entire master/replica group goes down

If every node of one master/replica group fails, the slots that group owned become unserved. With cluster-require-full-coverage set to no, the remaining groups keep serving their own slots, so the rest of the cluster stays usable, but the keys that hash into the lost slot range can no longer be read, and new writes to those slots fail as well, until the group is restored.

10.3 Redis Cluster Node Maintenance

After a cluster has run for a long time, hardware failures, network re-planning, or business growth will call for adjustments: adding Redis nodes, removing nodes, migrating nodes, replacing servers, and so on.

Whether adding or removing groups, it is advisable to end up with an odd number of master/replica groups, because nodes vote during elections and an odd count avoids tied votes and split-brain.

Adding or removing nodes involves reassigning the existing slots and migrating data.

10.3.1 Dynamic Scale-Out

Only one new group is added here, because the lab machine ran out of memory...

10.0.0.87 will be the master, 10.0.0.88 its replica.

1. Configure the new nodes
#Configure node7, 10.0.0.87
[root@redis-node7 ~]#dnf -y  install redis
[root@redis-node7 ~]#sed -i.bak -e 's/bind 127.0.0.1/bind 0.0.0.0/' -e '/masterauth/a masterauth redis' -e '/# requirepass/a requirepass redis' -e '/# cluster-enabled yes/a cluster-enabled yes' -e '/# cluster-config-file nodes-6379.conf/a cluster-config-file nodes-6379.conf' -e '/appendonly no/c appendonly yes' /etc/redis.conf
[root@redis-node7 ~]#systemctl start redis

#Configure node8, 10.0.0.88
[root@redis-node8 ~]#dnf -y  install redis
[root@redis-node8 ~]#sed -i.bak -e 's/bind 127.0.0.1/bind 0.0.0.0/' -e '/masterauth/a masterauth redis' -e '/# requirepass/a requirepass redis' -e '/# cluster-enabled yes/a cluster-enabled yes' -e '/# cluster-config-file nodes-6379.conf/a cluster-config-file nodes-6379.conf' -e '/appendonly no/c appendonly yes' /etc/redis.conf
[root@redis-node8 ~]#systemctl start redis
Enable the service on both nodes so it survives reboots:
systemctl enable --now redis
2. Add the new master (node7, 10.0.0.87) to the cluster

Use the following command, giving the new node's IP and port plus the IP and port of any node already in the cluster:

add-node new_host:new_port existing_host:existing_port

Where:
new_host:new_port               #IP and port of the host being added
existing_host:existing_port     #IP and port of any node already in the cluster

Adding the node with Redis 5:

#Add the new host 10.0.0.87 to the cluster; 10.0.0.85 below can be any existing cluster node, master or replica
[23:13:05 root@81 ~]#redis-cli -a redis --cluster add-node 10.0.0.87:6379 10.0.0.85:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 10.0.0.87:6379 to cluster 10.0.0.85:6379
>>> Performing Cluster Check (using node 10.0.0.85:6379)
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 10.0.0.87:6379 to make it join the cluster.
[OK] New node added correctly.

#10.0.0.87 has joined successfully: it holds no slots yet, and the new host is a master
[23:21:55 root@81 ~]#redis-cli -a redis --cluster info 10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.81:6379 (6a789b9a...) -> 3331 keys | 5461 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 0 keys | 0 slots | 0 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 3329 keys | 5461 slots | 1 slaves.
10.0.0.82:6379 (104844ad...) -> 3340 keys | 5462 slots | 1 slaves.
[OK] 10000 keys in 4 masters.
0.61 keys per slot on average.

[23:22:08 root@81 ~]#redis-cli -a redis --cluster check  10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.81:6379 (6a789b9a...) -> 3331 keys | 5461 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 0 keys | 0 slots | 0 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 3329 keys | 5461 slots | 1 slaves.
10.0.0.82:6379 (104844ad...) -> 3340 keys | 5462 slots | 1 slaves.
[OK] 10000 keys in 4 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots: (0 slots) master
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

[23:22:58 root@81 ~]#cat /var/lib/redis/nodes-6379.conf 
7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379@16379 master - 0 1621524090262 0 connected
6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379@16379 myself,master - 0 1621524089000 8 connected 0-5460
bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379@16379 slave 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 0 1621524088000 6 connected
34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379@16379 slave 104844ad87a3e145884b520980cedc70232986f7 0 1621524090259 5 connected
1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379@16379 master - 0 1621524088208 3 connected 10923-16383
609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379@16379 slave 6a789b9a447c400581df3c63071104f16032f2c6 0 1621524089233 8 connected
104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379@16379 master - 0 1621524089000 2 connected 5461-10922
vars currentEpoch 8 lastVoteEpoch 0


#same output as above
[23:23:28 root@81 ~]#redis-cli -a redis CLUSTER NODES
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379@16379 master - 0 1621524261243 0 connected
6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379@16379 myself,master - 0 1621524260000 8 connected 0-5460
bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379@16379 slave 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 0 1621524261000 6 connected
34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379@16379 slave 104844ad87a3e145884b520980cedc70232986f7 0 1621524261000 5 connected
1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379@16379 master - 0 1621524259000 3 connected 10923-16383
609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379@16379 slave 6a789b9a447c400581df3c63071104f16032f2c6 0 1621524263291 8 connected
104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379@16379 master - 0 1621524262259 2 connected 5461-10922


# Check cluster state
[23:24:23 root@81 ~]#redis-cli -a redis CLUSTER INFO
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:7
cluster_size:3
cluster_current_epoch:8
cluster_my_epoch:8
cluster_stats_messages_ping_sent:1927
cluster_stats_messages_pong_sent:847
cluster_stats_messages_auth-req_sent:5
cluster_stats_messages_update_sent:2
cluster_stats_messages_sent:2781
cluster_stats_messages_ping_received:842
cluster_stats_messages_pong_received:845
cluster_stats_messages_meet_received:1
cluster_stats_messages_fail_received:2
cluster_stats_messages_auth-ack_received:2
cluster_stats_messages_received:1692

  1. Reassign slots to the new master, 10.0.0.87

After a new node joins the cluster it is a master by default, but it owns no slots, so slots must be reassigned to it.

Until the newly added host is given slots it cannot store any data, so reassigning slots is a mandatory step after adding it.

Reassigning slots requires the data to be cleared, so back up the data first and restore it once the expansion is done: have the developers export the data and re-import it after scaling out. This is because RDB and AOF files only hold a single node's data, while a cluster's data is distributed across every node by slot, so the traditional single-node RDB/AOF export/import approach does not apply.

In addition, after the reshard, part of each existing master's slot range, together with the data stored in it, is moved to the new master, which triggers data migration.
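The slot a key belongs to is deterministic (slot = CRC16(key) mod 16384, per the Redis Cluster specification), so you can compute client-side which slot, and hence which master, any key lands on. A minimal sketch in Python; the helper names are illustrative, not part of any Redis client:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM (poly 0x1021, init 0), the CRC variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: bytes) -> int:
    """Map a key to one of the 16384 hash slots, honoring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # non-empty tag: only the tag part is hashed
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384

# Keys sharing a hash tag land in the same slot, so multi-key ops on them work
print(key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers"))  # True
```

This also makes clear why a per-node RDB/AOF file is only a slice of the dataset: each node only ever sees the keys whose slots it owns.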

Redis 5:

[23:26:00 root@81 ~]#redis-cli -a redis --cluster reshard  10.0.0.87:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing Cluster Check (using node 10.0.0.87:6379)
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots: (0 slots) master
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)?4096  # slots for the new node = 16384 / number of masters
What is the receiving node ID? 7ccff9f724e4508303a8184df5b0903789979759 # ID of the new master, 10.0.0.87
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: all # which source hosts give up slots; "all" takes slots evenly from every existing node. When removing a host from the cluster, specify that node's ID here instead so all of its slots are moved to the other Redis hosts
......
Do you want to proceed with the proposed reshard plan (yes/no)?  yes # confirm the plan
......
Moving slot 5606 from 10.0.0.82:6379 to 10.0.0.87:6379: 
Moving slot 5607 from 10.0.0.82:6379 to 10.0.0.87:6379: ..
Moving slot 5608 from 10.0.0.82:6379 to 10.0.0.87:6379: 
Moving slot 5609 from 10.0.0.82:6379 to 10.0.0.87:6379: 
Moving slot 5610 from 10.0.0.82:6379 to 10.0.0.87:6379: .
Moving slot 5611 from 10.0.0.82:6379 to 10.0.0.87:6379: ..
Moving slot 5612 from 10.0.0.82:6379 to 10.0.0.87:6379: 

...


# Verify the slots were assigned
[23:31:35 root@81 ~]#redis-cli -a redis --cluster check 10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.81:6379 (6a789b9a...) -> 2511 keys | 4096 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 2474 keys | 4096 slots | 0 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 2500 keys | 4096 slots | 1 slaves.
10.0.0.82:6379 (104844ad...) -> 2515 keys | 4096 slots | 1 slaves.
[OK] 10000 keys in 4 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master # 4096 slots assigned; the ranges are not contiguous, which is fine
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
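As the check output shows, a reshard can leave a master with non-contiguous slot ranges; what matters is only that the per-node counts add up to 16384. A quick sanity check on the ranges reported for 10.0.0.87 (an illustrative helper, not part of redis-cli):

```python
def slot_count(ranges):
    """Total slots covered by a list of inclusive (low, high) ranges."""
    return sum(high - low + 1 for low, high in ranges)

# Ranges reported for 10.0.0.87 after the reshard
print(slot_count([(0, 1364), (5461, 6826), (10923, 12287)]))  # 4096
```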

  1. Add a new slave, 10.0.0.88, for the new master

Add another standalone Redis server, 10.0.0.88, to the cluster to cover the single point of failure that 10.0.0.87 currently represents, i.e. to make it highly available.

When adding the new node to the cluster, make it a slave directly.

Redis 5

# Check current state
[23:31:41 root@81 ~]#redis-cli -a redis --cluster check 10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.81:6379 (6a789b9a...) -> 2511 keys | 4096 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 2474 keys | 4096 slots | 0 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 2500 keys | 4096 slots | 1 slaves.
10.0.0.82:6379 (104844ad...) -> 2515 keys | 4096 slots | 1 slaves.
[OK] 10000 keys in 4 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.


# Add the node directly as a slave
[23:33:04 root@81 ~]#redis-cli -a redis --cluster add-node 10.0.0.88:6379 10.0.0.81:6379 --cluster-slave  --cluster-master-id 7ccff9f724e4508303a8184df5b0903789979759 # only the ID of the new slave's master needs to be specified
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Adding node 10.0.0.88:6379 to cluster 10.0.0.81:6379
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 10.0.0.88:6379 to make it join the cluster.
Waiting for the cluster to join

>>> Configure node as replica of 10.0.0.87:6379.
[OK] New node added correctly.


# Verify it succeeded
[23:33:25 root@81 ~]#redis-cli -a redis --cluster check 10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.81:6379 (6a789b9a...) -> 2511 keys | 4096 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 2474 keys | 4096 slots | 1 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 2500 keys | 4096 slots | 1 slaves.
10.0.0.82:6379 (104844ad...) -> 2515 keys | 4096 slots | 1 slaves.
[OK] 10000 keys in 4 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
S: 0c9c433017111a3a20af6583840a824c1c020630 10.0.0.88:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
   1 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

10.3.2 Redis Cluster Node Maintenance: Dynamic Scale-Down

Node removal process:

Adding a node means first joining it to the cluster and then assigning slots; removing one is the exact reverse: first migrate the slots held by the Redis node being removed to other nodes in the cluster, and only then delete the node.

If any slots on a Redis node have not been fully migrated away, attempting to delete that node fails with a message that it still holds data.

Redis 5

Remove master 10.0.0.81 and its corresponding slave 10.0.0.84

  1. Migrate the slots of master 10.0.0.81 to the other masters
The Redis master being removed must end up holding no data
# Check current state
[23:34:06 root@81 ~]#redis-cli -a redis --cluster check 10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.81:6379 (6a789b9a...) -> 2511 keys | 4096 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 2474 keys | 4096 slots | 1 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 2500 keys | 4096 slots | 1 slaves.
10.0.0.82:6379 (104844ad...) -> 2515 keys | 4096 slots | 1 slaves.
[OK] 10000 keys in 4 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
S: 0c9c433017111a3a20af6583840a824c1c020630 10.0.0.88:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
   1 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.


# Connect to any cluster node; first, move 1365 slots from 10.0.0.81 to the second master, 10.0.0.83
[23:38:29 root@81 ~]#redis-cli -a redis --cluster reshard 10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots:[1365-5460] (4096 slots) master
   1 additional replica(s)
S: 0c9c433017111a3a20af6583840a824c1c020630 10.0.0.88:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
   1 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[12288-16383] (4096 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 6a789b9a447c400581df3c63071104f16032f2c6
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[6827-10922] (4096 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1365 # 4096 slots total, split roughly as 4096/3 across the other three masters
What is the receiving node ID? 1e4c9a3e6fe5fce70617bf417ef76463a33e334f # master 10.0.0.83
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
Source node #1: 6a789b9a447c400581df3c63071104f16032f2c6 # enter the ID of node 10.0.0.81, which is being removed
Source node #2: done

Ready to move 1365 slots.
  Source nodes:
    M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
       slots:[1365-5460] (4096 slots) master
       1 additional replica(s)
  Destination node:
    M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
       slots:[12288-16383] (4096 slots) master
       1 additional replica(s)
  Resharding plan:
    Moving slot 1365 from 6a789b9a447c400581df3c63071104f16032f2c6
......
    Moving slot 2719 from 6a789b9a447c400581df3c63071104f16032f2c6
    Moving slot 2720 from 6a789b9a447c400581df3c63071104f16032f2c6
Do you want to proceed with the proposed reshard plan (yes/no)? yes # confirm
......
Moving slot 2718 from 10.0.0.81:6379 to 10.0.0.83:6379: ..
Moving slot 2719 from 10.0.0.81:6379 to 10.0.0.83:6379: .
Moving slot 2720 from 10.0.0.81:6379 to 10.0.0.83:6379: ..

# Non-interactive method
# Next, move another 1365 slots from 10.0.0.81 to the first master, 10.0.0.82
[23:46:42 root@81 ~]#redis-cli -a redis --cluster reshard 10.0.0.81:6379 --cluster-slots 1365 --cluster-from 6a789b9a447c400581df3c63071104f16032f2c6 --cluster-to 104844ad87a3e145884b520980cedc70232986f7 --cluster-yes

# Finally, move the remaining 1366 slots from 10.0.0.81 to the third master, 10.0.0.87
[23:46:42 root@81 ~]#redis-cli -a redis --cluster reshard 10.0.0.81:6379 --cluster-slots 1366 --cluster-from 6a789b9a447c400581df3c63071104f16032f2c6 --cluster-to 7ccff9f724e4508303a8184df5b0903789979759 --cluster-yes
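The 4096 slots being emptied do not divide evenly by three, which is why the transcript uses a 1365/1365/1366 split. A tiny helper reproducing that arithmetic (hypothetical, for illustration only):

```python
def split_slots(total: int, parts: int) -> list:
    """Split `total` slots as evenly as possible into `parts` shares."""
    base, remainder = divmod(total, parts)
    return [base + 1 if i < remainder else base for i in range(parts)]

print(split_slots(4096, 3))       # [1366, 1365, 1365]
print(sum(split_slots(4096, 3)))  # 4096
```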


# Confirm that all slots have been moved off 10.0.0.81; its slave is automatically detached and becomes a slave of another master
[23:48:15 root@81 ~]#redis-cli -a redis --cluster check 10.0.0.81:6379
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.81:6379 (6a789b9a...) -> 0 keys | 0 slots | 0 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 3306 keys | 5462 slots | 2 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 3341 keys | 5461 slots | 1 slaves.
10.0.0.82:6379 (104844ad...) -> 3353 keys | 5461 slots | 1 slaves.
[OK] 10000 keys in 4 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.81:6379)
M: 6a789b9a447c400581df3c63071104f16032f2c6 10.0.0.81:6379
   slots: (0 slots) master
S: 0c9c433017111a3a20af6583840a824c1c020630 10.0.0.88:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[4095-6826],[10923-12287] (5462 slots) master
   2 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[1365-2729],[12288-16383] (5461 slots) master
   1 additional replica(s)
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[2730-4094],[6827-10922] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.


# The former slave 10.0.0.84 automatically became a slave of 10.0.0.87
[23:48:30 root@81 ~]#redis-cli -a redis  -h 10.0.0.87 INFO replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.88,port=6379,state=online,offset=61891,lag=0
slave1:ip=10.0.0.84,port=6379,state=online,offset=61891,lag=0
master_replid:b1403e0730cf01c6fa4ac7d95841c4d67a3afde4
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:61891
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:61891

  1. Remove the server from the cluster

Although the slot migration is complete, the server's IP information is still recorded in the cluster, so it must also be deleted from the cluster.

# Remove master 10.0.0.81
[23:50:00 root@81 ~]#redis-cli -a redis --cluster del-node 10.0.0.81:6379 6a789b9a447c400581df3c63071104f16032f2c6
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Removing node 6a789b9a447c400581df3c63071104f16032f2c6 from cluster 10.0.0.81:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.

# After the node is removed, its redis process shuts down automatically

[23:51:13 root@81 ~]#ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
[23:51:34 root@81 ~]#ps aux | grep redis
root 1936 0.0 0.1 12108 1100 pts/0 R+ 23:51 0:00 grep --color=auto redis

# Delete the node state file
[23:51:41 root@81 ~]#rm -f /var/lib/redis/nodes-6379.conf

# Verify the removal succeeded
[root@redis-node1 ~]#ss -ntl
State        Recv-Q        Send-Q   Local Address:Port      Peer Address:Port        
LISTEN       0             128            0.0.0.0:22             0.0.0.0:*           
LISTEN       0             128               [::]:22                [::]:*  

[23:52:05 root@81 ~]#redis-cli -a redis --cluster check 10.0.0.82:6379 
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.82:6379 (104844ad...) -> 3353 keys | 5461 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 3306 keys | 5462 slots | 2 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 3341 keys | 5461 slots | 1 slaves.
[OK] 10000 keys in 3 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.82:6379)
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[2730-4094],[6827-10922] (5461 slots) master
   1 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[4095-6826],[10923-12287] (5462 slots) master
   2 additional replica(s)
S: 0c9c433017111a3a20af6583840a824c1c020630 10.0.0.88:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[1365-2729],[12288-16383] (5461 slots) master
   1 additional replica(s)
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
S: 609ff8b268daa75a2fe0061d38f732b5ef859fb6 10.0.0.84:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.


# Remove the now-redundant slave node
[23:52:37 root@81 ~]#redis-cli -a redis --cluster del-node 10.0.0.84:6379 609ff8b268daa75a2fe0061d38f732b5ef859fb6
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Removing node 609ff8b268daa75a2fe0061d38f732b5ef859fb6 from cluster 10.0.0.84:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.


# Delete the cluster state file
[23:54:42 root@84 ~]#rm -f /var/lib/redis/nodes-6379.conf 


[23:54:18 root@81 ~]#redis-cli -a redis --cluster check 10.0.0.82:6379 
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.82:6379 (104844ad...) -> 3353 keys | 5461 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 3306 keys | 5462 slots | 1 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 3341 keys | 5461 slots | 1 slaves.
[OK] 10000 keys in 3 masters.
0.61 keys per slot on average.
>>> Performing Cluster Check (using node 10.0.0.82:6379)
M: 104844ad87a3e145884b520980cedc70232986f7 10.0.0.82:6379
   slots:[2730-4094],[6827-10922] (5461 slots) master
   1 additional replica(s)
S: bd496dc81e17ccfdf8431d136254d083ab4bb145 10.0.0.86:6379
   slots: (0 slots) slave
   replicates 1e4c9a3e6fe5fce70617bf417ef76463a33e334f
M: 7ccff9f724e4508303a8184df5b0903789979759 10.0.0.87:6379
   slots:[0-1364],[4095-6826],[10923-12287] (5462 slots) master
   1 additional replica(s)
S: 0c9c433017111a3a20af6583840a824c1c020630 10.0.0.88:6379
   slots: (0 slots) slave
   replicates 7ccff9f724e4508303a8184df5b0903789979759
M: 1e4c9a3e6fe5fce70617bf417ef76463a33e334f 10.0.0.83:6379
   slots:[1365-2729],[12288-16383] (5461 slots) master
   1 additional replica(s)
S: 34f27507a24e436a121527344fab181e9877404a 10.0.0.85:6379
   slots: (0 slots) slave
   replicates 104844ad87a3e145884b520980cedc70232986f7
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

[23:55:04 root@81 ~]#redis-cli -a redis --cluster info  10.0.0.82:6379 
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.0.0.82:6379 (104844ad...) -> 3353 keys | 5461 slots | 1 slaves.
10.0.0.87:6379 (7ccff9f7...) -> 3306 keys | 5462 slots | 1 slaves.
10.0.0.83:6379 (1e4c9a3e...) -> 3341 keys | 5461 slots | 1 slaves.
[OK] 10000 keys in 3 masters.
0.61 keys per slot on average.

# Check cluster info
[23:55:45 root@81 ~]#redis-cli -a redis -h 10.0.0.82 CLUSTER INFO
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:12
cluster_my_epoch:11
cluster_stats_messages_ping_sent:8248
cluster_stats_messages_pong_sent:3768
cluster_stats_messages_meet_sent:4
cluster_stats_messages_fail_sent:8
cluster_stats_messages_auth-ack_sent:2
cluster_stats_messages_update_sent:45
cluster_stats_messages_sent:12075
cluster_stats_messages_ping_received:3764
cluster_stats_messages_pong_received:3744
cluster_stats_messages_meet_received:4
cluster_stats_messages_fail_received:2
cluster_stats_messages_auth-req_received:2
cluster_stats_messages_update_received:2
cluster_stats_messages_received:7518

10.3.3 Limitations of Cluster

In most cases client performance drops, because every read and write involves hashing the key to locate its slot and node
Commands cannot span nodes: e.g. mget, keys, scan, flush, sinter
Client maintenance is more complex: heavier SDK and application overhead (for example, more connection pools)
Multiple databases are not supported: cluster mode only has db 0
Replication is only one level deep: tree-style and cascading replication are not supported, only one master with multiple slaves
Limited support for key transactions and Lua: all keys involved must live on one node; transactions and Lua cannot span nodes
Changing architectures (e.g. from master/slave to Sentinel, or from Sentinel to Cluster) requires application code changes, since, for example, a Python program uses different client libraries for each deployment mode
After nodes run for a long time, one node may hold more data, consume more memory, and receive more requests than the others, a condition known as cluster skew

Causes of skew:

1. Node and slot allocation layout
2. Large differences in the number of keys stored per slot
3. Presence of bigkeys; use them sparingly
4. Inconsistent memory-related configuration across nodes
5. Unbalanced hot data: when strict consistency is not required, a local cache plus a message queue can offload hotspots
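Cluster skew can be spotted directly from the per-master key counts that `redis-cli --cluster check` prints. A small illustrative helper (not part of any Redis tooling) that turns those counts into a balance ratio:

```python
def skew_ratio(keys_per_master: dict) -> float:
    """Ratio of the busiest master's key count to the mean; 1.0 means perfectly balanced."""
    counts = list(keys_per_master.values())
    return max(counts) / (sum(counts) / len(counts))

# Per-master key counts taken from the `--cluster check` output above
counts = {"10.0.0.82": 3353, "10.0.0.87": 3306, "10.0.0.83": 3341}
print(f"skew ratio: {skew_ratio(counts):.4f}")  # close to 1.0, i.e. well balanced
```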