Deepgreen & Greenplum High Availability (Part 2) - Master Failover

xiaoxiao 2023-09-10

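The previous article covered failover for Segment nodes: adding Mirrors, automatic failover, rebalancing the cluster back to its original state once the fault is repaired, and a few recommendations. For details, see "Deepgreen & Greenplum High Availability (Part 1) - Segment Node Failover". Picking up where that left off, this article looks at failover for the Master node.

I. In theory, protecting the Master against being a single point of failure requires deploying the Master Standby on a different server from the Master. When the Master fails, the synchronization process stops, and the Standby has to be activated manually on the Standby host. During activation, the replicated logs are used to restore the Master to its state as of the last successfully committed transaction. A new Standby can also be specified while the current Standby is being activated. Now for the experiment:

1. Add a Standby to the existing cluster. The argument to -s is the hostname of the Standby host and must match the entry in /etc/hosts.

    gpinitstandby -s reverse

2. Check the current Master and Standby Master status with SQL. In the output, role p is the active Master (host flash) and role m is the Standby (host reverse).

    postgres=# select string_agg(role||'-'||hostname,'|') from gp_segment_configuration where content='-1';
        string_agg
    -------------------
     p-flash|m-reverse

3. Simulate a Master failure and switch to the Standby.

    # Find the Master process
    ps -ef | grep postgres
    # Kill it, using the Master process PID found above
    kill -9 3897
    # Test whether the cluster accepts connections (fails)
    psql -d postgres
    # Switch to the Standby (run on the Standby host)
    gpactivatestandby -d ${MASTER_DATA_DIRECTORY}
    # Test whether the cluster accepts connections (succeeds)
    psql -d postgres
    # To create a new Standby as part of the switch, run this instead
    gpactivatestandby -d ${MASTER_DATA_DIRECTORY} -c new_standby_hostname

4. Once the switch is done, connect to the database on the new Master host and run ANALYZE.

    psql dbname -c 'ANALYZE;'

That completes the failover.

II. As with Mirrors, a Standby usually does not get a machine to itself and is typically deployed on one of the Segment hosts. Letting the Standby carry the Master service for a long time therefore means it competes with the Segment instances on that host for resources. Once the original Master is repaired, the cluster should be switched back to its original layout as soon as possible. The steps are:

1. On the original Standby host (now serving as Master), initialize a Standby on the original Master host (the machine that was just repaired).

    gpinitstandby -s flash

2. On the Standby host that is currently serving as Master, stop the Master service.

    gpstop -m

3. On the original Master host, reactivate the Master service.

    gpactivatestandby -d $MASTER_DATA_DIRECTORY

4. After activation, check the status.

    gpstate -f

5. Once the status is normal, reinitialize the original Standby host as the Standby.

    gpinitstandby -s reverse

The cluster is now back in its original state. The terms original Master, current Master, original Standby, and current Standby get tangled easily here, so it is worth reading the steps slowly and keeping track of which host each one refers to.

III. Finally, two other Standby-related operations.

1. Resynchronize the Standby and bring it up to date.

    gpinitstandby -s standby_master_hostname -n

2. Remove the Standby.

    gpinitstandby -s standby_master_hostname -r

Notes: A Standby can easily be added and removed, whereas Mirrors can only be added, not removed. Also keep in mind that the Master's hot Standby has to be activated manually and is reached through a different IP address, while Segment mirrors fail over automatically.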
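Since the failback sequence in section II is easy to get wrong (it is all about which host plays which role), it can help to write the plan down before touching the cluster. The following is a minimal sketch, not a Greenplum utility: `host_for_role` and `print_failback_plan` are hypothetical helper names introduced here for illustration. The first parses the `role-hostname` string returned by the `string_agg` query above, and the second only prints the commands from section II with the hostnames filled in. Nothing is executed against a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; they parse text and print
# commands, and never run any Greenplum utility themselves.

# host_for_role ROLE SUMMARY
#   SUMMARY is the string_agg output from the status query,
#   e.g. "p-flash|m-reverse". ROLE is "p" (active Master) or "m" (Standby).
#   Prints the matching hostname, or returns 1 if the role is absent.
host_for_role() {
    local role="$1" summary="$2" entry
    local IFS='|'
    for entry in $summary; do
        # Each entry has the form "<role>-<hostname>".
        if [ "${entry%%-*}" = "$role" ]; then
            printf '%s\n' "${entry#*-}"
            return 0
        fi
    done
    return 1
}

# print_failback_plan ORIGINAL_MASTER ORIGINAL_STANDBY
#   Prints the section II command sequence, annotated with the host
#   each command is meant to be run on.
print_failback_plan() {
    local original_master="$1" original_standby="$2"
    cat <<EOF
# on ${original_standby} (currently serving as Master)
gpinitstandby -s ${original_master}
gpstop -m
# on ${original_master} (repaired original Master)
gpactivatestandby -d \$MASTER_DATA_DIRECTORY
gpstate -f
gpinitstandby -s ${original_standby}
EOF
}

# Example: the summary string recorded before the failure.
summary='p-flash|m-reverse'
print_failback_plan "$(host_for_role p "$summary")" "$(host_for_role m "$summary")"
```

Recording the query output before a failure matters here: after the Standby has been activated, the same query only reports the new Master, so the original layout has to come from an earlier snapshot.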
