Andy's blog

What does Redis Sentinel do?

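Sentinel monitors the Redis master. When the master fails to respond within the configured time, Sentinel selects one of the slaves as the new master, points the remaining slaves at it, and rewrites the configuration files of the redis-server instances involved, including the master's. When the original master is restarted, it too is pointed at the newly elected master.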

How does Sentinel work?

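Each Sentinel sends a PING command once per second to every master, slave, and other Sentinel instance it knows about.

If the time since an instance last returned a valid reply to PING exceeds the value of the down-after-milliseconds option, the Sentinel marks that instance as subjectively down.

If a master is marked as subjectively down, every Sentinel monitoring that master checks once per second whether the master really has entered the subjectively down state.

When enough Sentinels (at least the number set in the configuration file) confirm within the given time window that the master is subjectively down, the master is marked as objectively down.

Under normal conditions, each Sentinel sends an INFO command once every 10 seconds to all the masters and slaves it knows about.

When a master is marked as objectively down, the Sentinel increases the frequency of INFO commands sent to all slaves of that master from once every 10 seconds to once per second.

If not enough Sentinels agree that the master is down, the master's objectively down state is removed.

If the master again returns a valid reply to a Sentinel's PING command, the master's subjectively down state is removed.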
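The subjective-down rule above can be illustrated with a small Python sketch. This is only a model of the timing check, not Sentinel's actual implementation; the `MonitoredInstance` class and `update_sdown` function are hypothetical names, and the 30000 ms threshold is just an example value.

```python
from dataclasses import dataclass


@dataclass
class MonitoredInstance:
    """Hypothetical bookkeeping a Sentinel might keep for one instance."""
    name: str
    last_valid_reply_ms: int  # time of the last valid PING reply
    sdown: bool = False


def update_sdown(instance: MonitoredInstance, now_ms: int,
                 down_after_ms: int) -> bool:
    """Set SDOWN when no valid PING reply arrived within down_after_ms,
    and clear it again as soon as a recent valid reply exists."""
    instance.sdown = (now_ms - instance.last_valid_reply_ms) > down_after_ms
    return instance.sdown


master = MonitoredInstance("mymaster", last_valid_reply_ms=0)
print(update_sdown(master, now_ms=29_000, down_after_ms=30_000))  # False
print(update_sdown(master, now_ms=31_000, down_after_ms=30_000))  # True
master.last_valid_reply_ms = 31_500  # a valid PING reply arrives
print(update_sdown(master, now_ms=32_000, down_after_ms=30_000))  # False
```

The last call shows the recovery case: once a valid reply is seen again, the subjective-down flag is dropped.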

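Subjective down and objective down

Subjectively Down (SDOWN): the down judgment that the current Sentinel instance makes on its own about a Redis server.

Objectively Down (ODOWN): the down judgment reached about a master when multiple Sentinel instances have each judged it SDOWN and have exchanged that information through the SENTINEL is-master-down-by-addr command. Failover starts after this.

SDOWN applies to both masters and slaves. Once a Sentinel finds that a master has entered ODOWN, that Sentinel may be elected by the other Sentinels to perform automatic failover on the downed master.

ODOWN applies only to masters. For other Redis instances, Sentinel does not need to negotiate before judging them down, so slaves and other Sentinels never reach ODOWN.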
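As a rough illustration of the SDOWN to ODOWN step, the sketch below counts how many Sentinels currently report the master as subjectively down and compares that with the quorum. It is a simplification under stated assumptions: `is_odown` and the Sentinel labels are hypothetical, and the time window and the leader election that follow in a real deployment are left out.

```python
from typing import Mapping


def is_odown(sdown_votes: Mapping[str, bool], quorum: int) -> bool:
    """Return True when at least `quorum` Sentinels report the master as SDOWN."""
    return sum(sdown_votes.values()) >= quorum


# Hypothetical labels for three Sentinels and their current SDOWN opinion.
votes = {"sentinel-26379": True, "sentinel-26380": True, "sentinel-26381": False}
print(is_odown(votes, quorum=2))  # True

votes["sentinel-26380"] = False  # one Sentinel sees the master again
print(is_odown(votes, quorum=2))  # False
```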

Setting up Redis Sentinel in practice

Configure redis-server and Sentinel

[root@master tmp]# ll /etc/redis-*

-rw-r--r-- 1 root root 145 Nov 7 17:44 /etc/redis-6379.conf

-rw-r--r-- 1 root root 93 Nov 7 17:42 /etc/redis-6380.conf

-rw-r--r-- 1 root root 115 Nov 7 17:42 /etc/redis-6381.conf

-rw-r--r-- 1 root root 556 Nov 7 17:42 /etc/redis-sentinel-26379.conf

-rw-r--r-- 1 root root 556 Nov 7 17:42 /etc/redis-sentinel-26380.conf

-rw-r--r-- 1 root root 556 Nov 7 17:42 /etc/redis-sentinel-26381.conf

Sentinel configuration file:

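# Port of this Sentinel node

port 26379

dir /var/redis/data/

logfile "26379.log"

# This Sentinel monitors the master at 192.168.119.10:6379

# 2 is the quorum: at least 2 Sentinels must agree before the master is judged as failed

# mymaster is the alias of the master

sentinel monitor mymaster 192.168.119.10 6379 2

# Each Sentinel periodically sends PING to check whether the Redis data nodes and the other Sentinels are reachable; a node with no valid reply for more than 30000 ms (30 s) is judged unreachable

sentinel down-after-milliseconds mymaster 30000

# Once the Sentinels agree that the master has failed, the leader Sentinel performs the failover and selects a new master.

# The former slaves then start replicating from the new master; at most 1 slave does so at a time

sentinel parallel-syncs mymaster 1

# Failover timeout is 180000 ms

sentinel failover-timeout mymaster 180000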

The other two Sentinel configurations differ only in the port, so they can be generated with the commands below

sed 's/26379/26380/g' redis-sentinel-26379.conf > redis-sentinel-26380.conf

sed 's/26379/26381/g' redis-sentinel-26379.conf > redis-sentinel-26381.conf

Check whether the Sentinels are communicating

[root@master ~]# redis-cli -p 26379 info sentinel

# Sentinel

sentinel_masters:1

sentinel_tilt:0

sentinel_running_scripts:0

sentinel_scripts_queue_length:0

sentinel_simulate_failure_flags:0

master0:name=mymaster,status=ok,address=192.168.119.10:6379,slaves=2,sentinels=3

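# If the last line looks correct, Sentinel is running: the monitored master is named mymaster, its status is ok, its address is 192.168.119.10:6379, and it has 2 slaves and 3 Sentinels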
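For scripted health checks, the master0 line can be split into fields. The following is a minimal sketch that assumes the line has exactly the shape shown in the output above; `parse_master_line` is a hypothetical helper, not part of any Redis tool.

```python
def parse_master_line(line: str) -> dict:
    """Parse a 'masterN:key=value,...' line from `info sentinel` output."""
    # Split off the 'master0' prefix at the first colon only, so the
    # colon inside the address value is preserved.
    _, _, fields = line.strip().partition(":")
    info = dict(item.split("=", 1) for item in fields.split(","))
    for key in ("slaves", "sentinels"):
        info[key] = int(info[key])
    return info


line = ("master0:name=mymaster,status=ok,"
        "address=192.168.119.10:6379,slaves=2,sentinels=3")
info = parse_master_line(line)
print(info["status"], info["slaves"], info["sentinels"])  # ok 2 3
```

A check could then alert when status is not ok or when the sentinels count drops below the expected 3.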

At this point, if the master goes down unexpectedly, the Sentinels will promote one of the slaves to be the new master.
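After a failover, clients need to find out which node is now the master. One way is to ask a Sentinel with the SENTINEL get-master-addr-by-name command. The sketch below speaks the Redis wire protocol directly using only the Python standard library. The helper names are hypothetical, and it assumes the whole reply arrives in a single recv call, which is a simplification.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_master_addr(reply: bytes):
    """Parse the two-element array reply into (host, port).

    Returns None for any other reply, e.g. the null reply sent when the
    master name is not monitored by this Sentinel.
    """
    parts = reply.split(b"\r\n")
    if parts[0] != b"*2":
        return None
    return parts[2].decode(), int(parts[4])


def get_master_addr(sentinel_host: str, sentinel_port: int, name: str):
    """Ask one Sentinel for the current address of the named master."""
    with socket.create_connection((sentinel_host, sentinel_port), timeout=1) as conn:
        conn.sendall(encode_command("SENTINEL", "get-master-addr-by-name", name))
        return parse_master_addr(conn.recv(4096))


print(encode_command("PING"))  # b'*1\r\n$4\r\nPING\r\n'
```

Assuming a Sentinel is reachable on 192.168.119.10:26379 (the host is a guess based on the setup above), calling get_master_addr("192.168.119.10", 26379, "mymaster") would return the address of whichever node currently holds the master role.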
