Redis Advanced Series 08: Sentinel

Theory Overview

Q: What is Sentinel?

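In the article "Redis Advanced Series 07: Replication (Part 4), replica and sub-replica", we listed the advantages and disadvantages of master-replica replication: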

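Advantages

  • Asynchronous replication provides a degree of high availability at low cost;
  • It doubles as a disaster-recovery backup;
  • Replication is non-blocking;
  • It is simple to configure and easy to set up.

Disadvantages

  • The architecture is centralized: every write goes to the master, so the number of replica machines should not be too large;
  • Scaling out online is complicated; only a cluster can do that;
  • Whether the master or a replica fails, replication has no automatic recovery mechanism, so an operator has to step in manually;
  • There is no data sharding, so it struggles with very large data sets.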

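Sentinel exists to solve exactly that problem ("whether the master or a replica fails, replication has no automatic recovery mechanism, so an operator has to step in manually") and to take availability up a level.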

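Sentinel: a Sentinel system is made up of one or more Sentinel instances (each is essentially a Redis server running in a special mode). It monitors the master and the replicas of a replication setup, like a guard patrolling the Redis instance on every machine. When the Sentinels find that the master is down or not working as expected, they vote for a Sentinel leader. The leader picks one of the remaining replicas to act as the new master and repoints the other replicas at it, so the service keeps running and failover completes automatically. See the documentation at https://redis.io/docs/management/sentinel/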

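According to the official documentation, Sentinel provides:

  • Monitoring – Sentinel constantly checks whether the master and the replicas are working as expected.
  • Notification – if a monitored Redis instance has a problem, Sentinel can notify the system administrator or another program through an API.
  • Automatic failover – when the master fails, one of the replicas is automatically promoted to the master role.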

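Note: master-replica replication is one way of achieving high availability on top of replication technology; Sentinel and Cluster also evolved from that same technology. Whether you run plain replication or replication + Sentinel, both are non-clustered high availability built on replication ("pseudo high availability" is arguably the better name). For real high availability you need Redis Cluster, which has a data sharding mechanism and will be covered later.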

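Before you deploy Sentinel in production, there are a few things you need to know:

  • You need at least three Sentinel instances for a reliable deployment (preferably an odd number, which makes it easier to vote for a leader).
  • Sentinel nodes do not store any actual data (ordinary commands such as set and mset are not available), so replication + Sentinel does not raise the storage ceiling.
  • Replication + Sentinel does not increase concurrency.
  • Replication + Sentinel does not guarantee zero data loss.
  • Sentinel is essentially a Redis server running in a special mode, with its own configuration file instead of redis.conf.
  • A machine running in Sentinel mode listens on port 26379/TCP by default.
  • On startup, start the master first, then each replica, and the Sentinels last; on shutdown, do the reverse.
  • All Redis instances must use the same authentication password; only then can failover be fully automatic.
  • The failover itself is driven by the Sentinel leader that the Sentinel instances elect.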

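In a typical Sentinel deployment, three Sentinels each connect to the master and to every replica, and the Sentinels also connect to one another. These connections are how each Sentinel learns the health of every Redis service.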

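Basic information

Requirement: one master, two replicas, and three Sentinel nodes (three Sentinel instances)

The machines are listed in the following table:

OS environment          Hostname    Redis version     IP                  Time sync
RL 8.9, fresh install   Master      7.2.3 (source)    192.168.100.3/24    chrony
RL 8.9, fresh install   Replica1    7.2.3 (source)    192.168.100.4/24    chrony
RL 8.9, fresh install   Replica2    7.2.3 (source)    192.168.100.5/24    chrony
RL 8.9, fresh install   Sentinel1   7.2.3 (source)    10.1.1.3/24         chrony
RL 8.9, fresh install   Sentinel2   7.2.3 (source)    10.1.1.4/24         chrony
RL 8.9, fresh install   Sentinel3   7.2.3 (source)    10.1.1.5/24         chrony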

Because failover requires authentication, all Redis instances in the one-master, two-replica setup use the same password: MyPassword

Configuring the Master (192.168.100.3)

Set up time synchronization

Shell > vim /etc/chrony.conf
server ntp1.tencent.com iburst
server ntp2.tencent.com iburst
server ntp3.tencent.com iburst
server ntp4.tencent.com iburst
...

Shell > systemctl restart chronyd.service

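Install and configure Redis

# -O saves the archive under the file name that the tar command below expects
Shell > wget -O redis-7.2.3.tar.gz https://github.com/redis/redis/archive/7.2.3.tar.gz
Shell > tar -zvxf redis-7.2.3.tar.gz -C /usr/local/src/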

Shell > dnf config-manager --set-enabled powertools && dnf  -y install epel-release
Shell > dnf -y install jemalloc-devel make gcc-c++ python3
Shell > mkdir -p /usr/local/redis/
Shell > cd  /usr/local/src/redis-7.2.3/ &&  make  &&  make  PREFIX=/usr/local/redis/  install

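# Create the directories for the configuration file, the persistence files, and the logs
Shell > mkdir /usr/local/redis/config/ && mkdir /usr/local/redis/DB/ && mkdir /usr/local/redis/logs/
## Copy the configuration file into place
Shell > cp -p /usr/local/src/redis-7.2.3/redis.conf  /usr/local/redis/config/

# Basic configuration changes; the meaning of these parameters should be familiar by now
## Hybrid persistence (AOF with an RDB preamble) is enabled
Shell > vim /usr/local/redis/config/redis.conf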
...
bind 192.168.100.3
...
protected-mode no
...
daemonize yes
...
pidfile /var/run/master_6379.pid
...
loglevel notice
logfile /usr/local/redis/logs/master-redis.log
...
dbfilename master-dump.rdb
...
dir /usr/local/redis/DB/
...
masterauth MyPassword         ← needed because when the Master recovers from a failure and comes back as a replica, it has to authenticate again
...
requirepass MyPassword
...
rename-command flushall ""
rename-command flushdb ""
...
appendonly yes
appendfilename "master-appendonly.aof"
appenddirname "master-appendonlydir"
...
aof-use-rdb-preamble yes
...

# Try starting the server
Shell > /usr/local/redis/bin/redis-server /usr/local/redis/config/redis.conf
Shell > ss -tulnp
Netid    State     Recv-Q   Send-Q   Local Address:Port        Peer Address:Port     Process
udp      UNCONN    0        0            127.0.0.1:323             0.0.0.0:*         users:(("chronyd",pid=774,fd=5))
udp      UNCONN    0        0                [::1]:323                [::]:*         users:(("chronyd",pid=774,fd=6))
tcp      LISTEN    0        511      192.168.100.3:6379            0.0.0.0:*         users:(("redis-server",pid=9933,fd=6))
tcp      LISTEN    0        128            0.0.0.0:22              0.0.0.0:*         users:(("sshd",pid=788,fd=3))
tcp      LISTEN    0        128               [::]:22                 [::]:*         users:(("sshd",pid=788,fd=4))

# Check the log
Shell > cat /usr/local/redis/logs/master-redis.log
9965:M 15 Dec 2023 11:18:19.385 * Server initialized
9965:M 15 Dec 2023 11:18:19.386 * Reading RDB base file on AOF loading...
9965:M 15 Dec 2023 11:18:19.386 * Loading RDB produced by version 7.2.3
9965:M 15 Dec 2023 11:18:19.386 * RDB age 263 seconds
9965:M 15 Dec 2023 11:18:19.386 * RDB memory usage when created 0.83 Mb
9965:M 15 Dec 2023 11:18:19.386 * RDB is base AOF
9965:M 15 Dec 2023 11:18:19.386 * Done loading RDB, keys loaded: 0, keys expired: 0.
9965:M 15 Dec 2023 11:18:19.386 * DB loaded from base file master-appendonly.aof.1.base.rdb: 0.001 seconds
9965:M 15 Dec 2023 11:18:19.386 * DB loaded from append only file: 0.001 seconds
9965:M 15 Dec 2023 11:18:19.386 * Opening AOF incr file master-appendonly.aof.1.incr.aof on server start
9965:M 15 Dec 2023 11:18:19.386 * Ready to accept connections tcp

At this point the Master is configured.

Configuring Replica1 (192.168.100.4)

Set up time synchronization

Shell > vim /etc/chrony.conf
server ntp1.tencent.com iburst
server ntp2.tencent.com iburst
server ntp3.tencent.com iburst
server ntp4.tencent.com iburst
...

Shell > systemctl restart chronyd.service

Install and configure Redis

Shell > wget -O redis-7.2.3.tar.gz https://github.com/redis/redis/archive/7.2.3.tar.gz
Shell > tar -zvxf redis-7.2.3.tar.gz -C /usr/local/src/

Shell > dnf config-manager --set-enabled powertools && dnf  -y install epel-release
Shell > dnf -y install jemalloc-devel make gcc-c++ python3
Shell > mkdir -p /usr/local/redis/
Shell > cd  /usr/local/src/redis-7.2.3/ &&  make  &&  make  PREFIX=/usr/local/redis/  install

# Create the directories for the configuration file, the persistence files, and the logs
Shell > mkdir /usr/local/redis/config/ && mkdir /usr/local/redis/DB/ && mkdir /usr/local/redis/logs/
## Copy the configuration file into place
Shell > cp -p /usr/local/src/redis-7.2.3/redis.conf  /usr/local/redis/config/

# Basic configuration changes
Shell > vim /usr/local/redis/config/redis.conf
...
bind 192.168.100.4
...
protected-mode no
...
daemonize yes
...
pidfile /var/run/replica1_6379.pid
...
loglevel notice
logfile /usr/local/redis/logs/replica1-redis.log
...
dbfilename replica1-dump.rdb
...
dir /usr/local/redis/DB/
...
replicaof 192.168.100.3 6379    ← IP address and port of the master
...
masterauth MyPassword          ← password used to authenticate with the master
...
requirepass MyPassword
...
rename-command flushall ""
rename-command flushdb ""
...
appendonly yes
appendfilename "replica1-appendonly.aof"
appenddirname "replica1-appendonlydir"
...
aof-use-rdb-preamble yes
...

# Try starting the server
Shell > /usr/local/redis/bin/redis-server /usr/local/redis/config/redis.conf
Shell > ss -tulnp
Netid    State     Recv-Q    Send-Q       Local Address:Port      Peer Address:Port   Process
udp      UNCONN    0         0                127.0.0.1:323           0.0.0.0:*       users:(("chronyd",pid=705,fd=5))
udp      UNCONN    0         0                    [::1]:323              [::]:*       users:(("chronyd",pid=705,fd=6))
tcp      LISTEN    0         511          192.168.100.4:6379          0.0.0.0:*       users:(("redis-server",pid=9943,fd=6))
tcp      LISTEN    0         128                0.0.0.0:22             0.0.0.0:*      users:(("sshd",pid=734,fd=3))
tcp      LISTEN    0         128                   [::]:22                [::]:*      users:(("sshd",pid=734,fd=4))

# Check the log; lines like the following indicate that replication succeeded
Shell > cat /usr/local/redis/logs/replica1-redis.log
...
9943:S 15 Dec 2023 11:32:35.718 * Connecting to MASTER 192.168.100.3:6379
9943:S 15 Dec 2023 11:32:35.718 * MASTER <-> REPLICA sync started
...
9943:S 15 Dec 2023 11:32:40.647 * MASTER <-> REPLICA sync: Finished with success
...
9943:S 15 Dec 2023 11:32:40.669 * Background AOF rewrite finished successfully

Replica1 is now configured.

Configuring Replica2 (192.168.100.5)

Set up time synchronization

Shell > vim /etc/chrony.conf
server ntp1.tencent.com iburst
server ntp2.tencent.com iburst
server ntp3.tencent.com iburst
server ntp4.tencent.com iburst
...

Shell > systemctl restart chronyd.service

Install and configure Redis

Shell > wget -O redis-7.2.3.tar.gz https://github.com/redis/redis/archive/7.2.3.tar.gz
Shell > tar -zvxf redis-7.2.3.tar.gz -C /usr/local/src/

Shell > dnf config-manager --set-enabled powertools && dnf  -y install epel-release
Shell > dnf -y install jemalloc-devel make gcc-c++ python3
Shell > mkdir -p /usr/local/redis/
Shell > cd  /usr/local/src/redis-7.2.3/ &&  make  &&  make  PREFIX=/usr/local/redis/  install

# Create the directories for the configuration file, the persistence files, and the logs
Shell > mkdir /usr/local/redis/config/ && mkdir /usr/local/redis/DB/ && mkdir /usr/local/redis/logs/
## Copy the configuration file into place
Shell > cp -p /usr/local/src/redis-7.2.3/redis.conf  /usr/local/redis/config/

# Basic configuration changes
Shell > vim /usr/local/redis/config/redis.conf
...
bind 192.168.100.5
...
protected-mode no
...
daemonize yes
...
pidfile /var/run/replica2_6379.pid
...
loglevel notice
logfile /usr/local/redis/logs/replica2-redis.log
...
dbfilename replica2-dump.rdb
...
dir /usr/local/redis/DB/
...
replicaof 192.168.100.3 6379    ← IP address and port of the master
...
masterauth MyPassword          ← password used to authenticate with the master
...
requirepass MyPassword
...
rename-command flushall ""
rename-command flushdb ""
...
appendonly yes
appendfilename "replica2-appendonly.aof"
appenddirname "replica2-appendonlydir"
...
aof-use-rdb-preamble yes
...

# Try starting the server
Shell > /usr/local/redis/bin/redis-server /usr/local/redis/config/redis.conf
Shell > ss -tulnp

# Check the log
Shell > cat /usr/local/redis/logs/replica2-redis.log
...
11665:S 15 Dec 2023 11:55:59.900 * Ready to accept connections tcp
11665:S 15 Dec 2023 11:55:59.900 * Connecting to MASTER 192.168.100.3:6379
11665:S 15 Dec 2023 11:55:59.901 * MASTER <-> REPLICA sync started
...
11665:S 15 Dec 2023 11:56:04.814 * MASTER <-> REPLICA sync: Finished with success
...
11665:S 15 Dec 2023 11:56:04.844 * Background AOF rewrite finished successfully

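Replica2 is now configured.

sentinel.conf parameters

As mentioned earlier, Sentinel is a Redis server running in a special mode, with its own configuration file instead of redis.conf. That file, sentinel.conf, ships in the Redis source tarball. The commonly used parameters are: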

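  • protected-mode no – whether Sentinel runs in protected mode.
  • port 26379 – the port Sentinel listens on.
  • daemonize no – whether Sentinel runs in the background as a daemon. Change it to yes for this setup.
  • pidfile /var/run/redis-sentinel.pid – location of the PID file.
  • loglevel notice – log level.
  • logfile "" – location of the log file. Set it to a real path for this setup.
  • sentinel announce-ip <ip> – commented out by default. Only needed in network environments that involve port mapping or address translation, such as NAT or Docker.
  • sentinel announce-port <port> – commented out by default. Only needed in network environments that involve port mapping or address translation, such as NAT or Docker.
  • dir /tmp – working directory of the running process.

  • sentinel monitor mymaster 127.0.0.1 6379 2 – the IP address and port of the master to monitor. mymaster is the name given to the master. The trailing 2 is the quorum: the number of Sentinels that must agree before the master is marked objectively down. Subjectively down and objectively down are explained shortly.
  • sentinel auth-pass <master-name> <password> – the password Sentinel uses to connect to the master.
  • sentinel down-after-milliseconds mymaster 30000 – if the master has not replied for this many milliseconds, this Sentinel subjectively considers it down.

  • requirepass <password> – the authentication password for Sentinel itself. If you set one, Sentinel uses the same password when it connects to the other Sentinels; in other words, all Sentinels must share the same password.
  • sentinel parallel-syncs mymaster 1 – how many replicas may be reconfigured to sync from the new master at the same time during a failover.
  • sentinel failover-timeout mymaster 180000 – the failover timeout in milliseconds. If it is exceeded, the failover is considered failed. The default is 180 seconds.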
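To make the shape of the "sentinel monitor" directive concrete, here is a minimal Python sketch that pulls the master name, address, and quorum out of a sentinel.conf fragment. The names Monitor, parse_monitor, and SAMPLE are illustrative assumptions and not part of Redis; the sample values are the ones used for this article's Sentinels.

```python
from typing import NamedTuple


class Monitor(NamedTuple):
    name: str
    host: str
    port: int
    quorum: int


def parse_monitor(conf_text: str) -> list[Monitor]:
    """Return every `sentinel monitor` directive found in conf_text."""
    monitors = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        # sentinel monitor <name> <ip> <port> <quorum>
        if (len(parts) == 6
                and parts[0].lower() == "sentinel"
                and parts[1].lower() == "monitor"):
            monitors.append(
                Monitor(parts[2], parts[3], int(parts[4]), int(parts[5]))
            )
    return monitors


# Hypothetical fragment built from the directives described above.
SAMPLE = """\
daemonize yes
# the quorum is the last field
sentinel monitor MyMaster 192.168.100.3 6379 2
sentinel down-after-milliseconds MyMaster 30000
"""
```

Calling parse_monitor(SAMPLE) yields a single Monitor whose quorum field is 2, the value that the ODOWN decision depends on.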

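Q: What is objectively down (ODOWN)?

Network jitter is normal and has many uncontrollable causes. A single Sentinel instance can therefore wrongly conclude that the master is down or misbehaving (for example, because its own heartbeat pings went unanswered) while the master is in fact working normally. So that Sentinel consults the other Sentinel instances, and the master is confirmed as really down only when the required number of them agree. The 2 in "sentinel monitor mymaster 127.0.0.1 6379 2" is that number of Sentinels. Only after this confirmation can failover formally begin and move on to the next step, which protects against one Sentinel's misjudgment.

Q: What is subjectively down (SDOWN)?

A single Sentinel instance considers the master down. The criterion is this parameter: "sentinel down-after-milliseconds mymaster 30000".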
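The two states can be sketched as two small predicates. This is a simplified model under stated assumptions, not Sentinel's actual code: is_sdown and is_odown are hypothetical helper names, timestamps are plain millisecond integers, and the defaults mirror the 30000 ms and quorum of 2 used in this article.

```python
def is_sdown(last_valid_reply_ms: int, now_ms: int,
             down_after_ms: int = 30000) -> bool:
    """One Sentinel's local view: no valid reply for longer than down_after_ms."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_odown(sdown_reports: list[bool], quorum: int = 2) -> bool:
    """Objectively down once at least `quorum` Sentinels report SDOWN."""
    return sum(sdown_reports) >= quorum
```

With three Sentinels and a quorum of 2, is_odown([True, True, False]) is true, while a single Sentinel's report, is_odown([True, False, False]), is not enough.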

Configuring Sentinel1 (10.1.1.3)

Set up time synchronization

Shell > vim /etc/chrony.conf
server ntp1.tencent.com iburst
server ntp2.tencent.com iburst
server ntp3.tencent.com iburst
server ntp4.tencent.com iburst
...

Shell > systemctl restart chronyd.service

Install Redis and configure Sentinel

Shell > wget -O redis-7.2.3.tar.gz https://github.com/redis/redis/archive/7.2.3.tar.gz
Shell > tar -zvxf redis-7.2.3.tar.gz -C /usr/local/src/

Shell > dnf config-manager --set-enabled powertools && dnf  -y install epel-release
Shell > dnf -y install jemalloc-devel make gcc-c++ python3
Shell > mkdir -p /usr/local/redis/
Shell > cd  /usr/local/src/redis-7.2.3/ &&  make  &&  make  PREFIX=/usr/local/redis/  install

# Create the directories for the configuration file and the logs
## Sentinel stores no actual data, so no persistence directory is needed
Shell > mkdir /usr/local/redis/config/ && mkdir /usr/local/redis/logs/
## Copy the configuration file into place; this time it is sentinel.conf
Shell > cp -p /usr/local/src/redis-7.2.3/sentinel.conf /usr/local/redis/config/

# Basic Sentinel configuration
Shell > vim /usr/local/redis/config/sentinel.conf
...
daemonize yes
...
pidfile /var/run/sentinel1.pid
...
logfile /usr/local/redis/logs/sentinel1.log
...
sentinel monitor MyMaster 192.168.100.3 6379 2
...
sentinel auth-pass MyMaster MyPassword
...
sentinel down-after-milliseconds MyMaster 30000
...
sentinel parallel-syncs MyMaster 1
...
sentinel failover-timeout MyMaster 180000
...
SENTINEL master-reboot-down-after-period MyMaster 0

# Start the Sentinel process
Shell > /usr/local/redis/bin/redis-sentinel /usr/local/redis/config/sentinel.conf

# Check the log
## Note that this Sentinel's ID starts with ac72
Shell > cat /usr/local/redis/logs/sentinel1.log
...
9460:X 15 Dec 2023 18:10:01.450 * Running mode=sentinel, port=26379.
9460:X 15 Dec 2023 18:10:01.452 * Sentinel new configuration saved on disk
9460:X 15 Dec 2023 18:10:01.452 * Sentinel ID is ac72b0837dd3fd5c7227c4ca87535be4bd246691
9460:X 15 Dec 2023 18:10:01.452 # +monitor master MyMaster 192.168.100.3 6379 quorum 2
9460:X 15 Dec 2023 18:10:01.458 * +slave slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.3 6379
9460:X 15 Dec 2023 18:10:01.459 * Sentinel new configuration saved on disk
9460:X 15 Dec 2023 18:10:01.459 * +slave slave 192.168.100.5:6379 192.168.100.5 6379 @ MyMaster 192.168.100.3 6379
9460:X 15 Dec 2023 18:10:01.460 * Sentinel new configuration saved on disk

Configuring Sentinel2 (10.1.1.4)

Set up time synchronization

Shell > vim /etc/chrony.conf
server ntp1.tencent.com iburst
server ntp2.tencent.com iburst
server ntp3.tencent.com iburst
server ntp4.tencent.com iburst
...

Shell > systemctl restart chronyd.service

Install Redis and configure Sentinel

Shell > wget -O redis-7.2.3.tar.gz https://github.com/redis/redis/archive/7.2.3.tar.gz
Shell > tar -zvxf redis-7.2.3.tar.gz -C /usr/local/src/

Shell > dnf config-manager --set-enabled powertools && dnf  -y install epel-release
Shell > dnf -y install jemalloc-devel make gcc-c++ python3
Shell > mkdir -p /usr/local/redis/
Shell > cd  /usr/local/src/redis-7.2.3/ &&  make  &&  make  PREFIX=/usr/local/redis/  install

# Create the directories for the configuration file and the logs
Shell > mkdir /usr/local/redis/config/ && mkdir /usr/local/redis/logs/
## Copy sentinel.conf
Shell > cp -p /usr/local/src/redis-7.2.3/sentinel.conf /usr/local/redis/config/

# Basic Sentinel configuration
Shell > vim /usr/local/redis/config/sentinel.conf
...
daemonize yes
...
pidfile /var/run/sentinel2.pid
...
logfile /usr/local/redis/logs/sentinel2.log
...
sentinel monitor MyMaster 192.168.100.3 6379 2
...
sentinel auth-pass MyMaster MyPassword
...
sentinel down-after-milliseconds MyMaster 30000
...
sentinel parallel-syncs MyMaster 1
...
sentinel failover-timeout MyMaster 180000
...
SENTINEL master-reboot-down-after-period MyMaster 0

# Start the Sentinel process
Shell > /usr/local/redis/bin/redis-sentinel /usr/local/redis/config/sentinel.conf

# Check the log
Shell > cat /usr/local/redis/logs/sentinel2.log
...
9423:X 15 Dec 2023 18:23:02.644 * +sentinel sentinel ac72b0837dd3fd5c7227c4ca87535be4bd246691 10.1.1.3 26379 @ MyMaster 192.168.100.3 6379
9423:X 15 Dec 2023 18:23:02.645 * Sentinel new configuration saved on disk
9423:X 15 Dec 2023 18:23:30.840 # +sdown slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.3 6379
9423:X 15 Dec 2023 18:23:30.840 # +sdown slave 192.168.100.5:6379 192.168.100.5 6379 @ MyMaster 192.168.100.3 6379

Configuring Sentinel3 (10.1.1.5)

Set up time synchronization

Shell > vim /etc/chrony.conf
server ntp1.tencent.com iburst
server ntp2.tencent.com iburst
server ntp3.tencent.com iburst
server ntp4.tencent.com iburst
...

Shell > systemctl restart chronyd.service

Install Redis and configure Sentinel

Shell > wget -O redis-7.2.3.tar.gz https://github.com/redis/redis/archive/7.2.3.tar.gz
Shell > tar -zvxf redis-7.2.3.tar.gz -C /usr/local/src/

Shell > dnf config-manager --set-enabled powertools && dnf  -y install epel-release
Shell > dnf -y install jemalloc-devel make gcc-c++ python3
Shell > mkdir -p /usr/local/redis/
Shell > cd  /usr/local/src/redis-7.2.3/ &&  make  &&  make  PREFIX=/usr/local/redis/  install

# Create the directories for the configuration file and the logs
Shell > mkdir /usr/local/redis/config/ && mkdir /usr/local/redis/logs/
## Copy sentinel.conf
Shell > cp -p /usr/local/src/redis-7.2.3/sentinel.conf /usr/local/redis/config/

# Basic Sentinel configuration
Shell > vim /usr/local/redis/config/sentinel.conf
...
daemonize yes
...
pidfile /var/run/sentinel3.pid
...
logfile /usr/local/redis/logs/sentinel3.log
...
sentinel monitor MyMaster 192.168.100.3 6379 2
...
sentinel auth-pass MyMaster MyPassword
...
sentinel down-after-milliseconds MyMaster 30000
...
sentinel parallel-syncs MyMaster 1
...
sentinel failover-timeout MyMaster 180000
...
SENTINEL master-reboot-down-after-period MyMaster 0

# Start the Sentinel process
Shell > /usr/local/redis/bin/redis-sentinel /usr/local/redis/config/sentinel.conf

# Check the log
Shell > cat /usr/local/redis/logs/sentinel3.log
...
9367:X 15 Dec 2023 18:33:35.666 * Sentinel new configuration saved on disk
9367:X 15 Dec 2023 18:34:03.772 # +sdown slave 192.168.100.5:6379 192.168.100.5 6379 @ MyMaster 192.168.100.3 6379
9367:X 15 Dec 2023 18:34:03.772 # +sdown slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.3 6379

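At this point the one master, two replicas, and three Sentinels are all configured and running normally.

Additional notes

Normal operation

Let's look at the information each Redis instance reports in this replication setup:

Output of the info replication command on 192.168.100.3, 192.168.100.4, and 192.168.100.5, in that order: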

# Replication
role:master
connected_slaves:2
slave0:ip=192.168.100.4,port=6379,state=online,offset=185643,lag=1
slave1:ip=192.168.100.5,port=6379,state=online,offset=185643,lag=1
master_failover_state:no-failover
master_replid:dea980fb890209c2c782214f0ea7ba70f0035ce3
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:185657
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:185657

-----------

# Replication
role:slave
master_host:192.168.100.3
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_read_repl_offset:200223
slave_repl_offset:200223   ← replication offset
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:dea980fb890209c2c782214f0ea7ba70f0035ce3
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:200223
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:15
repl_backlog_histlen:200209

------------

# Replication
role:slave
master_host:192.168.100.3
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_read_repl_offset:204753
slave_repl_offset:204753  ← replication offset
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:dea980fb890209c2c782214f0ea7ba70f0035ce3
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:204753
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:15
repl_backlog_histlen:204739

# Write data on the Master:
192.168.100.3:6379> set name zhang

# Check on the replicas
192.168.100.4:6379> keys *
1) "name"
192.168.100.4:6379> get name
"zhang"

192.168.100.5:6379> keys *
1) "name"
192.168.100.5:6379> get name
"zhang"
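Reading these outputs by eye gets tedious once you check them repeatedly. The following Python sketch parses the text of info replication and extracts each replica's offset as the master reports it. parse_info, replica_offsets, and MASTER_INFO are illustrative names rather than a Redis API, and the sample is an excerpt of the master's output shown above.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse `key:value` lines of INFO output, skipping headers and blanks."""
    info = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def replica_offsets(info: dict[str, str]) -> dict[str, int]:
    """Map "ip:port" of each replica listed by a master to its offset."""
    offsets = {}
    for key, value in info.items():
        # Replica entries are named slave0, slave1, ...
        if key.startswith("slave") and key[5:].isdigit():
            fields = dict(item.split("=", 1) for item in value.split(","))
            offsets[f'{fields["ip"]}:{fields["port"]}'] = int(fields["offset"])
    return offsets


# Excerpt of the master's output from this article.
MASTER_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.100.4,port=6379,state=online,offset=185643,lag=1
slave1:ip=192.168.100.5,port=6379,state=online,offset=185643,lag=1
master_repl_offset:185657
"""
```

Comparing each value from replica_offsets(parse_info(MASTER_INFO)) with master_repl_offset gives a rough idea of how far each replica trails the master.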

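Whichever Sentinel instance you inspect, the bottom of its configuration file has been rewritten, for example on Sentinel1 (10.1.1.3).

Simulating a Master failure

Shell (192.168.100.3)> killall redis-server

Run the info replication command on 192.168.100.4 and 192.168.100.5: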

# Replication
role:slave
master_host:192.168.100.5
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_read_repl_offset:637898
slave_repl_offset:637898
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:e45e3c4e6339b9c46ec6d4ba6ced822a6875dc25
master_replid2:dea980fb890209c2c782214f0ea7ba70f0035ce3
master_repl_offset:637898
second_repl_offset:575984
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:15
repl_backlog_histlen:637884

----------

# Replication
role:master
connected_slaves:1
slave0:ip=192.168.100.4,port=6379,state=online,offset=641598,lag=0
master_failover_state:no-failover
master_replid:e45e3c4e6339b9c46ec6d4ba6ced822a6875dc25
master_replid2:dea980fb890209c2c782214f0ea7ba70f0035ce3
master_repl_offset:641598
second_repl_offset:575984
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:15
repl_backlog_histlen:641584

This time 192.168.100.5 was given the master role, and the failover succeeded. Some configuration parameters are written to the bottom of redis.conf on 192.168.100.5, and the line "replicaof 192.168.100.3 6379" is removed.
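After a failover, a client can ask any Sentinel which instance is currently the master by using the SENTINEL get-master-addr-by-name command. The Python sketch below encodes that command in the Redis wire protocol and parses the reply. It is a minimal illustration under assumptions: encode_command, parse_addr_reply, and current_master are hypothetical helper names, the Sentinel is assumed to have no requirepass set (as in this article), and the reply is assumed to arrive in a single read.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_addr_reply(reply: bytes) -> tuple[str, int]:
    """Parse the two-element array reply: [ip, port]."""
    lines = reply.split(b"\r\n")
    # Expected layout: *2, $<len>, ip, $<len>, port, ''
    if lines[0] != b"*2":
        raise ValueError(f"unexpected reply: {reply!r}")
    return lines[2].decode(), int(lines[4])


def current_master(sentinel_host: str, sentinel_port: int = 26379,
                   name: str = "MyMaster", timeout: float = 2.0) -> tuple[str, int]:
    """Ask one Sentinel for the address of the current master."""
    with socket.create_connection((sentinel_host, sentinel_port), timeout=timeout) as sock:
        sock.sendall(encode_command("SENTINEL", "get-master-addr-by-name", name))
        # The reply is tiny, so this sketch assumes one recv() returns all of it.
        return parse_addr_reply(sock.recv(4096))
```

In this article's environment, current_master("10.1.1.3") would be expected to return the address of 192.168.100.5 after the failover described above. In practice a client library with Sentinel support does this lookup for you.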

The log of Sentinel1 (10.1.1.3) records the following for this failover:

9537:X 15 Dec 2023 19:50:49.012 # +sdown master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:49.065 # +odown master MyMaster 192.168.100.3 6379 #quorum 3/2
9537:X 15 Dec 2023 19:50:49.065 # +new-epoch 1
9537:X 15 Dec 2023 19:50:49.065 # +try-failover master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:49.079 * Sentinel new configuration saved on disk
9537:X 15 Dec 2023 19:50:49.079 # +vote-for-leader ac72b0837dd3fd5c7227c4ca87535be4bd246691 1
9537:X 15 Dec 2023 19:50:49.082 * 64aece52e32b27915d02c68c26a4d2eb994c8af1 voted for ac72b0837dd3fd5c7227c4ca87535be4bd246691 1
9537:X 15 Dec 2023 19:50:49.082 * c2ea57fc2271bba350e09c6dd698128521abe4be voted for ac72b0837dd3fd5c7227c4ca87535be4bd246691 1
9537:X 15 Dec 2023 19:50:49.132 # +elected-leader master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:49.132 # +failover-state-select-slave master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:49.194 # +selected-slave slave 192.168.100.5:6379 192.168.100.5 6379 @ MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:49.194 * +failover-state-send-slaveof-noone slave 192.168.100.5:6379 192.168.100.5 6379 @ MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:49.266 * +failover-state-wait-promotion slave 192.168.100.5:6379 192.168.100.5 6379 @ MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:50.115 * Sentinel new configuration saved on disk
9537:X 15 Dec 2023 19:50:50.115 # +promoted-slave slave 192.168.100.5:6379 192.168.100.5 6379 @ MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:50.115 # +failover-state-reconf-slaves master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:50.189 * +slave-reconf-sent slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:51.173 * +slave-reconf-inprog slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:51.173 * +slave-reconf-done slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:51.238 # -odown master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:51.238 # +failover-end master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:51.238 # +switch-master MyMaster 192.168.100.3 6379 192.168.100.5 6379
9537:X 15 Dec 2023 19:50:51.238 * +slave slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.5 6379
9537:X 15 Dec 2023 19:50:51.238 * +slave slave 192.168.100.3:6379 192.168.100.3 6379 @ MyMaster 192.168.100.5 6379
9537:X 15 Dec 2023 19:50:51.251 * Sentinel new configuration saved on disk
9537:X 15 Dec 2023 19:51:21.239 # +sdown slave 192.168.100.3:6379 192.168.100.3 6379 @ MyMaster 192.168.100.5 6379

The log of Sentinel2 (10.1.1.4) records the following:

9481:X 15 Dec 2023 19:50:49.004 # +sdown master MyMaster 192.168.100.3 6379
9481:X 15 Dec 2023 19:50:49.081 * Sentinel new configuration saved on disk
9481:X 15 Dec 2023 19:50:49.081 # +new-epoch 1
9481:X 15 Dec 2023 19:50:49.083 * Sentinel new configuration saved on disk
9481:X 15 Dec 2023 19:50:49.083 # +vote-for-leader ac72b0837dd3fd5c7227c4ca87535be4bd246691 1
9481:X 15 Dec 2023 19:50:49.083 # +odown master MyMaster 192.168.100.3 6379 #quorum 2/2
9481:X 15 Dec 2023 19:50:49.083 * Next failover delay: I will not start a failover before Fri Dec 15 19:56:49 2023
9481:X 15 Dec 2023 19:50:50.190 # +config-update-from sentinel ac72b0837dd3fd5c7227c4ca87535be4bd246691 10.1.1.3 26379 @ MyMaster 192.168.100.3 6379
9481:X 15 Dec 2023 19:50:50.190 # +switch-master MyMaster 192.168.100.3 6379 192.168.100.5 6379
9481:X 15 Dec 2023 19:50:50.191 * +slave slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.5 6379
9481:X 15 Dec 2023 19:50:50.191 * +slave slave 192.168.100.3:6379 192.168.100.3 6379 @ MyMaster 192.168.100.5 6379
9481:X 15 Dec 2023 19:50:50.192 * Sentinel new configuration saved on disk
9481:X 15 Dec 2023 19:51:20.269 # +sdown slave 192.168.100.3:6379 192.168.100.3 6379 @ MyMaster 192.168.100.5 6379

The log of Sentinel3 (10.1.1.5) records the following:

9430:X 15 Dec 2023 19:50:48.979 # +sdown master MyMaster 192.168.100.3 6379
9430:X 15 Dec 2023 19:50:49.081 * Sentinel new configuration saved on disk
9430:X 15 Dec 2023 19:50:49.081 # +new-epoch 1
9430:X 15 Dec 2023 19:50:49.082 * Sentinel new configuration saved on disk
9430:X 15 Dec 2023 19:50:49.082 # +vote-for-leader ac72b0837dd3fd5c7227c4ca87535be4bd246691 1
9430:X 15 Dec 2023 19:50:50.081 # +odown master MyMaster 192.168.100.3 6379 #quorum 3/2
9430:X 15 Dec 2023 19:50:50.081 * Next failover delay: I will not start a failover before Fri Dec 15 19:56:49 2023
9430:X 15 Dec 2023 19:50:50.190 # +config-update-from sentinel ac72b0837dd3fd5c7227c4ca87535be4bd246691 10.1.1.3 26379 @ MyMaster 192.168.100.3 6379
9430:X 15 Dec 2023 19:50:50.190 # +switch-master MyMaster 192.168.100.3 6379 192.168.100.5 6379
9430:X 15 Dec 2023 19:50:50.190 * +slave slave 192.168.100.4:6379 192.168.100.4 6379 @ MyMaster 192.168.100.5 6379
9430:X 15 Dec 2023 19:50:50.190 * +slave slave 192.168.100.3:6379 192.168.100.3 6379 @ MyMaster 192.168.100.5 6379
9430:X 15 Dec 2023 19:50:50.192 * Sentinel new configuration saved on disk
9430:X 15 Dec 2023 19:51:20.221 # +sdown slave 192.168.100.3:6379 192.168.100.3 6379 @ MyMaster 192.168.100.5 6379
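The line that matters most in these logs is +switch-master, which names the old and the new master. As a small illustration, this Python sketch scans Sentinel log text for that event. SWITCH_RE, find_switches, and LOG_SAMPLE are hypothetical names, and the pattern is written against the log format shown above rather than a documented grammar.

```python
import re

# +switch-master <master-name> <old-ip> <old-port> <new-ip> <new-port>
SWITCH_RE = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)


def find_switches(log_text: str) -> list[dict[str, str]]:
    """Return one dict per +switch-master event found in the log text."""
    switches = []
    for line in log_text.splitlines():
        match = SWITCH_RE.search(line)
        if match:
            switches.append(match.groupdict())
    return switches


# Two lines copied from the Sentinel1 log above.
LOG_SAMPLE = """\
9537:X 15 Dec 2023 19:50:51.238 # +failover-end master MyMaster 192.168.100.3 6379
9537:X 15 Dec 2023 19:50:51.238 # +switch-master MyMaster 192.168.100.3 6379 192.168.100.5 6379
"""
```

A monitoring script could run find_switches over the tail of the log and raise an alert whenever the result is not empty.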

Q: What happens when 192.168.100.3 comes back?

192.168.100.3 becomes a replica, and the bottom of its redis.conf is rewritten to point at the new master, as follows:

Shell (192.168.100.3)> /usr/local/redis/bin/redis-server /usr/local/redis/config/redis.conf

Shell (192.168.100.3)> /usr/local/redis/bin/redis-cli -h 192.168.100.3 -p 6379 -a "MyPassword"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.

192.168.100.3:6379> info replication
# Replication
role:slave
master_host:192.168.100.5  ← the master has changed
master_port:6379
master_link_status:up      ← the link is up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_read_repl_offset:1079503
slave_repl_offset:1079503
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:e45e3c4e6339b9c46ec6d4ba6ced822a6875dc25
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1079503
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1078665
repl_backlog_histlen:839

Shell (192.168.100.3)> tail -n 10 /usr/local/redis/config/redis.conf
#
# ignore-warnings ARM64-COW-BUG

# Generated by CONFIG REWRITE
save 3600 1
save 300 100
save 60 10000
latency-tracking-info-percentiles 50 99 99.9
replicaof 192.168.100.5 6379
user default on sanitize-payload #dc1e7c03e162397b355b6f1c895dfdf3790d98c10b920c55e91272b8eecada2a ~* &* +@all

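Q&A

Q: How does Sentinel work from start to finish?

While replication is working normally:

  • Every Sentinel instance sends a ping heartbeat every second to monitor the health of the master and of each replica.

After the Master fails, as the logs above show, the following happens:

  • Each Sentinel keeps sending its ping heartbeat every second. Once the master has not replied for 30 s (the default), the SDOWN (subjectively down) condition is met.

  • One Sentinel instance believing the master is down is not enough. It has to consult the other Sentinel instances, and once they confirm with one another, the ODOWN (objectively down) condition is met and the master is considered definitely down.

  • The Sentinels vote among themselves to elect a leader, whose job is to carry out the failover (in the logs above, 10.1.1.3, whose Sentinel ID starts with ac72b, first voted for itself). Failover means picking one of the remaining replicas to act as master. The criteria, from highest to lowest precedence, are:

    • The priority in the Redis instance's configuration file (replica-priority 100): the smaller the number, the higher the priority. A value of 0 means the replica is not eligible to become master.
    • If the priorities are equal, the replication offsets of the replicas are compared, and the replica with the larger offset becomes master (this is the "slave_repl_offset" line in the info replication output; as you saw, I did not configure any priority, so 192.168.100.5 was chosen as the new Master on the basis of its offset).
    • If the replication offsets are also equal, the Run IDs are compared, and the replica with the smaller one becomes master.
  • After the failover, each Sentinel writes entries to its log. The bottom of each Redis instance's redis.conf is also rewritten to point at the new master; on the instance that now holds the master role, the old "replicaof" line is removed.

  • During the failover, clients that were already connected to the Redis instances see brief errors; reconnecting resolves them.

Note: to see a Redis instance's run ID, run the info server command in the Redis interactive shell.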
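The three selection criteria above amount to a single sort key. The Python sketch below expresses them that way. It is a simplified model, not Sentinel's implementation: Replica and select_replica are hypothetical names, the offsets are the ones shown earlier in this article, and the run IDs are made-up placeholders.

```python
from typing import NamedTuple, Optional


class Replica(NamedTuple):
    addr: str
    priority: int      # replica-priority; 0 means never promote
    repl_offset: int   # slave_repl_offset from info replication
    run_id: str


def select_replica(replicas: list[Replica]) -> Optional[Replica]:
    """Pick the promotion candidate by priority, then offset, then run ID."""
    candidates = [r for r in replicas if r.priority != 0]
    if not candidates:
        return None
    # Lower priority first, larger offset first, smaller run ID first.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# Offsets from this article; run IDs are placeholders for illustration.
REPLICAS = [
    Replica("192.168.100.4:6379", 100, 200223, "aaaa"),
    Replica("192.168.100.5:6379", 100, 204753, "ffff"),
]
```

With equal priorities, select_replica(REPLICAS) returns the entry for 192.168.100.5 because its offset is larger, even though its placeholder run ID sorts later.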

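Q: What is the Sentinel vote actually for? Is the 2 in "sentinel monitor mymaster 127.0.0.1 6379 2" also a vote count?

The Sentinels vote to elect a Sentinel leader, which then drives the failover. Note that the Sentinels do not vote on which of the remaining replicas becomes the new master after failover; do not confuse the two. The 2 is the number of Sentinels that must agree on objectively down.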
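The difference between the quorum and the leader vote is easiest to see with numbers. In the Redis Sentinel design, the quorum only decides when the master is flagged as objectively down, while actually starting a failover also requires authorization from a majority of all Sentinels. The Python sketch below models that rule; majority and failover_outcome are hypothetical helper names, and the model ignores timing and epochs.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that is more than half of the total."""
    return total_sentinels // 2 + 1


def failover_outcome(total_sentinels: int, agreeing: int, quorum: int) -> str:
    """agreeing: Sentinels that can reach each other and see the master as down."""
    if agreeing < quorum:
        return "no ODOWN"
    if agreeing < majority(total_sentinels):
        return "ODOWN, but failover not authorized"
    return "failover can start"
```

For the three Sentinels in this article, failover_outcome(3, 2, 2) returns "failover can start". With five Sentinels and the same quorum, failover_outcome(5, 2, 2) returns "ODOWN, but failover not authorized", because two Sentinels are not a majority of five.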

Q: How do the Sentinels elect a leader among themselves?

Through a leader election based on the Raft algorithm.
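As a rough illustration of how such an election is decided, the Python sketch below tallies the votes of one epoch. It is a simplified model under assumptions, not Sentinel's code: elect_leader is a hypothetical name, each Sentinel is assumed to cast one vote per epoch, and the IDs are shortened from the logs above.

```python
from collections import Counter
from typing import Optional


def elect_leader(votes: dict[str, str], total_sentinels: int,
                 quorum: int) -> Optional[str]:
    """votes maps each voter's Sentinel ID to the candidate it voted for."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    # The winner needs a majority of all Sentinels and at least `quorum` votes.
    needed = max(total_sentinels // 2 + 1, quorum)
    return candidate if count >= needed else None


# Votes as recorded in the logs above (IDs shortened).
VOTES = {"ac72b08": "ac72b08", "64aece5": "ac72b08", "c2ea57f": "ac72b08"}
```

Here elect_leader(VOTES, 3, 2) returns "ac72b08", matching the +elected-leader line in the Sentinel1 log. If every Sentinel voted for a different candidate, the function would return None and a new election would be needed in a later epoch.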

Q: Is replication + Sentinel widely used in real-world business scenarios?

Not very. Redis Cluster is the usual choice.

About 陸風睿

GNU/Linux practitioner, open-source enthusiast, and technical tinkerer; writing documentation is both a hobby and part of the job. Q - "281957576"; WeChat - "jiulongxiaotianci"