PostgreSQL高可用之repmgr(1主2从+1witness)+Pgpool-II实现主从切换+读写分离+负载均衡
一、总体架构
本文最终需要实现的目标:
1、1主2从流复制
2、读写分离+负载均衡(pgpool-II)
3、主从自动切换(repmgr)
所有的节点:安装操作系统、创建PG用户目录、安装PG软件、安装repmgr
主库:只初始化主库、启动主库归档
二、环境申请
先申请4台机器,6.66节点最后再申请,反正docker容器都是即开即用的。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 | -- 主库 docker rm -f lhrrepmgr64361 docker run -d --name lhrrepmgr64361 -h lhrrepmgr64361 \ --net=pg-network --ip 172.72.6.61 \ -p 64361:5432 \ -v /sys/fs/cgroup:/sys/fs/cgroup \ --privileged=true lhrbest/lhrpgall:1.0 \ /usr/sbin/init docker network connect bridge lhrrepmgr64361 -- 从库1 docker rm -f lhrrepmgr64362 docker run -d --name lhrrepmgr64362 -h lhrrepmgr64362 \ --net=pg-network --ip 172.72.6.62 \ -p 64362:5432 \ -v /sys/fs/cgroup:/sys/fs/cgroup \ --privileged=true lhrbest/lhrpgall:1.0 \ /usr/sbin/init docker network connect bridge lhrrepmgr64362 -- 从库2 docker rm -f lhrrepmgr64363 docker run -d --name lhrrepmgr64363 -h lhrrepmgr64363 \ --net=pg-network --ip 172.72.6.63 \ -p 64363:5432 \ -v /sys/fs/cgroup:/sys/fs/cgroup \ --privileged=true lhrbest/lhrpgall:1.0 \ /usr/sbin/init docker network connect bridge lhrrepmgr64363 -- Witness库 docker rm -f lhrrepmgr64364 docker run -d --name lhrrepmgr64364 -h lhrrepmgr64364 \ --net=pg-network --ip 172.72.6.64 \ -p 64364:5432 \ -v /sys/fs/cgroup:/sys/fs/cgroup \ --privileged=true lhrbest/lhrpgall:1.0 \ /usr/sbin/init docker network connect bridge lhrrepmgr64364 -- 重启4台机器 docker restart lhrrepmgr64361 lhrrepmgr64362 lhrrepmgr64363 lhrrepmgr64364 [root@docker35 ~]# docker ps | grep repmgr 31d3d31c1073 lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64364->5432/tcp lhrrepmgr64364 572d4ea2c072 lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64363->5432/tcp lhrrepmgr64363 6ded416b2016 lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64362->5432/tcp lhrrepmgr64362 fa38b58b8f3d lhrbest/lhrpgall:1.0 "/usr/sbin/init" 41 hours ago Up 41 hours 5433-5435/tcp, 0.0.0.0:64361->5432/tcp lhrrepmgr64361 [root@docker35 ~]# |
? 注意:
1、镜像lhrbest/lhrpgall:1.0中,申请的4台主机均已安装好PG 13和repmgr软件
2、PG安装方式为编译安装,数据库已初始化完成,用户为pg13
3、该镜像里已安装了PG13、PG12、PG11、PG10和PG9.6这几个版本,均为编译安装,本文使用PG13来操作
三、配置4台主机之间的互信
1 2 3 4 5 6 | -- 只在主库61节点上,以root用户执行: chmod +x sshUserSetup.sh ./sshUserSetup.sh -user pg13 -hosts "172.72.6.61 172.72.6.62 172.72.6.63 172.72.6.64" -advanced exverify -confirm -- 在每台主机修改权限,否则报错:Bad owner or permissions on /home/pg13/.ssh/config chmod 600 /home/pg13/.ssh/config |
依次输入:yes,回车和密码即可。
脚本下载:https://share.weiyun.com/57HUxNi
验证:
1 2 3 4 | ssh pg13@lhrrepmgr64361 date ssh pg13@lhrrepmgr64362 date ssh pg13@lhrrepmgr64363 date ssh pg13@lhrrepmgr64364 date |
第二次执行时不再提示输入yes,并且可以成功执行命令,则表示SSH对等性配置成功。
执行过程:
1 2 3 4 5 6 7 8 9 | [pg13@lhrrepmgr64361 /]# ssh pg13@lhrrepmgr64362 date Tue Apr 27 17:15:04 CST 2021 [root@lhrrepmgr64361 /]# [pg13@lhrrepmgr64361 /]# ssh pg13@lhrrepmgr64363 date Tue Apr 27 17:15:08 CST 2021 [root@lhrrepmgr64361 /]# [pg13@lhrrepmgr64361 /]# ssh pg13@lhrrepmgr64364 date Tue Apr 27 17:15:13 CST 2021 [pg13@lhrrepmgr64361 /]# |
四、配置repmgr
4.1、主库修改pg_hba.conf参数文件
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | -- 进入主库 docker exec -it lhrrepmgr64361 bash su - pg13 cat >> /pg13/pgdata/pg_hba.conf <<"EOF" local repmgr repmgr md5 host repmgr repmgr 127.0.0.1/32 md5 host repmgr repmgr 172.72.6.0/24 md5 local replication repmgr md5 host replication repmgr 127.0.0.1/32 md5 host replication repmgr 172.72.6.0/24 md5 EOF |
4.2、主库修改postgresql.conf参数文件
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | -- 修改参数 cat >> /pg13/pgdata/postgresql.conf <<"EOF" # 归档参数 wal_level='replica' archive_mode='on' archive_command='test ! -f /pg13/archive/%f && cp %p /pg13/archive/%f' restore_command='cp /pg13/archive/%f %p' # 主从流复制 hot_standby=on max_wal_senders=10 wal_sender_timeout=60s wal_keep_size=16MB # 主从切换参数,启用PG数据库的复制槽,PG12不需要"use_replication_slots=true"这个参数了。 max_replication_slots=10 wal_log_hints=on # 自动切换 shared_preload_libraries ='repmgr' EOF -- 重启 pg_ctl start -- 查询 psql -U postgres -h 192.168.66.35 -p 64361 select * from pg_settings where name in ('wal_level','archive_mode','archive_command'); -- 切换归档 select pg_switch_wal(); |
4.3、主库创建相关用户和数据库
1 2 3 4 5 6 7 | -- 创建相关用户和数据库 su - pg13 createuser -s repmgr createdb repmgr -O repmgr psql -h 127.0.0.1 -c "alter user repmgr with password 'lhr';" psql -h 127.0.0.1 -c "alter user repmgr set search_path to repmgr, \"\$user\",public;" |
结果:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 | [root@lhrrepmgr64361 /]# systemctl start pg13 [root@lhrrepmgr64361 /]# systemctl status pg13 ● pg13.service - PostgreSQL database server Loaded: loaded (/etc/systemd/system/pg13.service; disabled; vendor preset: disabled) Active: active (running) since Tue 2021-04-27 16:25:24 CST; 6s ago Docs: man:postgres(1) Process: 769 ExecStart=/pg13/pg13/bin/pg_ctl start -D ${PGDATA} -s -o -p ${PGPORT} -w -t 300 (code=exited, status=0/SUCCESS) Main PID: 771 (postgres) CGroup: /docker/a777ef12d5ff83a9d47be51a98531bd45b42d2b008f7a25f894f3244ce9cc0d4/system.slice/pg13.service ├─771 /pg13/pg13/bin/postgres -D /pg13/pgdata -p 5432 ├─772 postgres: logger ├─774 postgres: checkpointer ├─775 postgres: background writer ├─776 postgres: walwriter ├─777 postgres: autovacuum launcher ├─778 postgres: stats collector └─779 postgres: logical replication launcher Apr 27 16:25:23 lhrrepmgr64361 systemd[1]: Starting PostgreSQL database server... Apr 27 16:25:24 lhrrepmgr64361 pg_ctl[769]: 2021-04-27 16:25:24.024 CST [771] LOG: redirecting log output to logging collector process Apr 27 16:25:24 lhrrepmgr64361 pg_ctl[769]: 2021-04-27 16:25:24.024 CST [771] HINT: Future log output will appear in directory "pg_log". Apr 27 16:25:24 lhrrepmgr64361 systemd[1]: Started PostgreSQL database server. [root@lhrrepmgr64361 /]# su - pg13 Last login: Tue Apr 27 16:24:50 CST 2021 on pts/0 [pg13@lhrrepmgr64361 ~]$ [pg13@lhrrepmgr64361 ~]$ createuser -s repmgr [pg13@lhrrepmgr64361 ~]$ [pg13@lhrrepmgr64361 ~]$ createdb repmgr -O repmgr [pg13@lhrrepmgr64361 ~]$ [pg13@lhrrepmgr64361 ~]$ psql -h 127.0.0.1 -c "alter user repmgr with password 'lhr'" ALTER ROLE [pg13@lhrrepmgr64361 ~]$ [pg13@lhrrepmgr64361 ~]$ psql -h 127.0.0.1 -c "alter user repmgr set search_path to repmgr, \"\$user\",public"; ALTER ROLE C:\Users\lhrxxt>psql -U postgres -h 192.168.66.35 -p 64361 Password for user postgres: psql (13.2) Type "help" for help. postgres=# \du List of roles Role name | Attributes | Member of -----------+------------------------------------------------------------+----------- postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {} repmgr | Superuser, Create role, Create DB | {} postgres=# \l List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -----------+----------+----------+------------+------------+----------------------- postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 | repmgr | repmgr | UTF8 | en_US.utf8 | en_US.utf8 | template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres + | | | | | postgres=CTc/postgres (4 rows) |
4.4、4个节点分别修改repmgr.conf
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | -- 以pg13用户修改 su - pg13 -- 主库 cat > /pg13/pg13/repmgr.conf << "EOF" node_id=1 node_name=lhrrepmgr64361 conninfo='host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2' data_directory='/pg13/pgdata' pg_bindir='/pg13/pg13/bin' EOF -- 从库1 cat > /pg13/pg13/repmgr.conf << "EOF" node_id=2 node_name=lhrrepmgr64362 conninfo='host=172.72.6.62 user=repmgr password=lhr dbname=repmgr connect_timeout=2' data_directory='/pg13/pgdata' pg_bindir='/pg13/pg13/bin' EOF -- 从库2 cat > /pg13/pg13/repmgr.conf << "EOF" node_id=3 node_name=lhrrepmgr64363 conninfo='host=172.72.6.63 user=repmgr password=lhr dbname=repmgr connect_timeout=2' data_directory='/pg13/pgdata' pg_bindir='/pg13/pg13/bin' EOF -- witness节点 cat > /pg13/pg13/repmgr.conf << "EOF" node_id=4 node_name=lhrrepmgr64364 conninfo='host=172.72.6.64 user=repmgr password=lhr dbname=repmgr connect_timeout=2' data_directory='/pg13/pgdata' pg_bindir='/pg13/pg13/bin' EOF -- 测试 psql 'host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2' psql 'host=172.72.6.62 user=repmgr password=lhr dbname=repmgr connect_timeout=2' psql 'host=172.72.6.63 user=repmgr password=lhr dbname=repmgr connect_timeout=2' psql 'host=172.72.6.64 user=repmgr password=lhr dbname=repmgr connect_timeout=2' |
4.5、在主库注册主库服务
1 2 3 4 5 6 | -- 注册服务 repmgr -f /pg13/pg13/repmgr.conf primary register -- 查询 repmgr -f /pg13/pg13/repmgr.conf cluster show psql -U repmgr -h 192.168.66.35 -p 64361 -d repmgr |
执行过程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | [pg13@lhrrepmgr64361 pg13]$ repmgr -f /pg13/pg13/repmgr.conf primary register INFO: connecting to primary database... NOTICE: attempting to install extension "repmgr" NOTICE: "repmgr" extension successfully installed NOTICE: primary node record (ID: 1) registered [pg13@lhrrepmgr64361 pg13]$ [pg13@lhrrepmgr64361 pg13]$ repmgr -f /pg13/pg13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+----------------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------------------------- 1 | lhrrepmgr64361 | primary | * running | | default | 100 | 1 | host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2 C:\Users\lhrxxt>psql -U repmgr -h 192.168.66.35 -p 64361 -d repmgr Password for user repmgr: psql (13.2) Type "help" for help. repmgr=# select * from repmgr.nodes; node_id | upstream_node_id | active | node_name | type | location | priority | conninfo | repluser | slot_name | config_file ---------+------------------+--------+----------------+---------+----------+----------+---------------------------------------------------------------------------+----------+-----------+------------------------ 1 | | t | lhrrepmgr64361 | primary | default | 100 | host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2 | repmgr | | /pg13/pg13/repmgr.conf (1 row) |
4.6、克隆备库1和备库2
4.6.1、4个节点都配~/.pgpass密码文件
1 2 3 4 5 6 7 8 9 10 | su - pg13 echo "*:*:*:repmgr:lhr" > ~/.pgpass chmod 0600 ~/.pgpass -- 测试,若不需要密码,那就证明配置正确 psql 'host=172.72.6.61 user=repmgr dbname=repmgr connect_timeout=2' psql 'host=172.72.6.62 user=repmgr dbname=repmgr connect_timeout=2' psql 'host=172.72.6.63 user=repmgr dbname=repmgr connect_timeout=2' psql 'host=172.72.6.64 user=repmgr dbname=repmgr connect_timeout=2' |
4.6.2、克隆备库1
1 2 3 4 5 6 7 8 9 10 11 12 | -- 其中--dry-run表示不真实执行克隆过程,只是检查是否有问题 repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force --dry-run -- 执行克隆命令,其实还是调用pg_basebackup repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force -- 启动 pg_ctl -D /pg13/pgdata start -- 查询 psql select * from pg_stat_wal_receiver; |
执行过程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 | [pg13@lhrrepmgr64362 ~]$ repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force --dry-run NOTICE: destination directory "/pg13/pgdata" provided INFO: connecting to source node DETAIL: connection string is: host=172.72.6.61 user=repmgr dbname=repmgr DETAIL: current installation size is 29 MB INFO: "repmgr" extension is installed in database "repmgr" WARNING: target data directory appears to be a PostgreSQL data directory DETAIL: target data directory is "/pg13/pgdata" HINT: use -F/--force to overwrite the existing data directory INFO: replication slot usage not requested; no replication slot will be set up for this standby INFO: parameter "max_wal_senders" set to 10 NOTICE: checking for available walsenders on the source node (2 required) INFO: sufficient walsenders available on the source node DETAIL: 2 required, 10 available NOTICE: checking replication connections can be made to the source server (2 required) INFO: required number of replication connections could be made to the source server DETAIL: 2 replication connections required WARNING: data checksums are not enabled and "wal_log_hints" is "off" DETAIL: pg_rewind requires "wal_log_hints" to be enabled NOTICE: standby will attach to upstream node 1 HINT: consider using the -c/--fast-checkpoint option INFO: all prerequisites for "standby clone" are met [pg13@lhrrepmgr64362 ~]$ repmgrrepmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone^C [pg13@lhrrepmgr64362 ~]$ repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone NOTICE: destination directory "/pg13/pgdata" provided INFO: connecting to source node DETAIL: connection string is: host=172.72.6.61 user=repmgr dbname=repmgr DETAIL: current installation size is 29 MB ERROR: target data directory appears to be a PostgreSQL data directory DETAIL: target data directory is "/pg13/pgdata" HINT: use -F/--force to overwrite the existing data directory [pg13@lhrrepmgr64362 ~]$ repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force NOTICE: destination directory "/pg13/pgdata" provided INFO: connecting to source node DETAIL: connection string is: host=172.72.6.61 user=repmgr dbname=repmgr DETAIL: current installation size is 29 MB INFO: replication slot usage not requested; no replication slot will be set up for this standby NOTICE: checking for available walsenders on the source node (2 required) NOTICE: checking replication connections can be made to the source server (2 required) WARNING: data checksums are not enabled and "wal_log_hints" is "off" DETAIL: pg_rewind requires "wal_log_hints" to be enabled WARNING: directory "/pg13/pgdata" exists but is not empty NOTICE: -F/--force provided - deleting existing data directory "/pg13/pgdata" NOTICE: starting backup (using pg_basebackup)... HINT: this may take some time; consider using the -c/--fast-checkpoint option INFO: executing: /pg13/pg13/bin/pg_basebackup -l "repmgr base backup" -D /pg13/pgdata -h 172.72.6.61 -p 5432 -U repmgr -X stream Password: WARNING: skipping special file "./.s.PGSQL.5432" WARNING: skipping special file "./.s.PGSQL.5432" NOTICE: standby clone (using pg_basebackup) complete NOTICE: you can now start your PostgreSQL server HINT: for example: pg_ctl -D /pg13/pgdata start HINT: after starting the server, you need to register this standby with "repmgr standby register" [pg13@lhrrepmgr64362 ~]$ pg_ctl -D /pg13/pgdata start waiting for server to start....2021-04-27 19:14:08.750 CST [9298] LOG: redirecting log output to logging collector process 2021-04-27 19:14:08.750 CST [9298] HINT: Future log output will appear in directory "pg_log". done server started [pg13@lhrrepmgr64362 ~]$ psql psql (13.2) Type "help" for help. postgres=# select * from pg_stat_wal_receiver; pid | status | receive_start_lsn | receive_start_tli | written_lsn | flushed_lsn | received_tli | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time | slot_name | sender_host | sender_port | conninfo ------+-----------+-------------------+-------------------+-------------+-------------+--------------+-------------------------------+-------------------------------+----------------+-------------------------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 9304 | streaming | 0/3000000 | 1 | 0/30001F0 | 0/30001F0 | 1 | 2021-04-27 19:15:08.862599+08 | 2021-04-27 19:15:08.862719+08 | 0/30001F0 | 2021-04-27 19:14:08.830865+08 | | 172.72.6.61 | 5432 | user=repmgr password=******** channel_binding=disable connect_timeout=2 dbname=replication host=172.72.6.61 port=5432 application_name=lhrrepmgr64362 fallback_application_name=walreceiver sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any (1 row) |
信息“NOTICE: standby clone (using pg_basebackup) complete”表示测试完成。
4.6.3、克隆备库2
1 2 3 4 5 6 7 8 9 10 | -- 其中--dry-run表示不真实执行克隆过程,只是检查是否有问题 repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force --dry-run repmgr -h 172.72.6.61 -U repmgr -d repmgr -f /pg13/pg13/repmgr.conf standby clone --force -- 启动 pg_ctl -D /pg13/pgdata start -- 查询 psql select * from pg_stat_wal_receiver; |
4.6.4、注册从库服务
1 2 3 | -- 分别在2个从库注册 repmgr -f /pg13/pg13/repmgr.conf standby register --force repmgr -f /pg13/pg13/repmgr.conf cluster show |
执行过程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | -- 从库2 [pg13@lhrrepmgr64362 ~]$ repmgr -f /pg13/pg13/repmgr.conf standby register INFO: connecting to local node "lhrrepmgr64362" (ID: 2) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1) INFO: standby registration complete NOTICE: standby node "lhrrepmgr64362" (ID: 2) successfully registered [pg13@lhrrepmgr64362 ~]$ repmgr -f /pg13/pg13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+----------------+---------+-----------+----------------+----------+----------+----------+--------------------------------------------------------------------------- 1 | lhrrepmgr64361 | primary | * running | | default | 100 | 1 | host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2 2 | lhrrepmgr64362 | standby | running | lhrrepmgr64361 | default | 100 | 1 | host=172.72.6.62 user=repmgr password=lhr dbname=repmgr connect_timeout=2 -- 从库3 [pg13@lhrrepmgr64363 pgdata]$ repmgr -f /pg13/pg13/repmgr.conf standby register INFO: connecting to local node "lhrrepmgr64363" (ID: 3) repmgr -f /pg13/pg13/repmgr.conf cluster showINFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1) INFO: standby registration complete NOTICE: standby node "lhrrepmgr64363" (ID: 3) successfully registered [pg13@lhrrepmgr64363 pgdata]$ repmgr -f /pg13/pg13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+----------------+---------+-----------+----------------+----------+----------+----------+--------------------------------------------------------------------------- 1 | lhrrepmgr64361 | primary | * running | | default | 100 | 1 | host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2 2 | lhrrepmgr64362 | standby | running | lhrrepmgr64361 | default | 100 | 1 | host=172.72.6.62 user=repmgr password=lhr dbname=repmgr connect_timeout=2 3 | lhrrepmgr64363 | standby | running | lhrrepmgr64361 | default | 100 | 1 | host=172.72.6.63 user=repmgr password=lhr dbname=repmgr connect_timeout=2 |
4.7、配置witness
4.7.1、初始化witness数据库
witness节点不能克隆,只能初始化数据库。
4.7.1.1、初始化并启动数据库
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 | -- 需要重新初始化环境 rm -rf /pg13/pgdata /pg13/pg13/bin/initdb -D /pg13/pgdata -E UTF8 --locale=en_US.utf8 -U postgres -- 修改参数postgresql.conf cat >> /pg13/pgdata/postgresql.conf <<"EOF" listen_addresses = '*' port=5432 unix_socket_directories='/pg13/pgdata' logging_collector = on log_directory = 'pg_log' log_filename = 'postgresql-%a.log' log_truncate_on_rotation = on # 归档参数 wal_level='replica' archive_mode='on' archive_command='test ! -f /pg13/archive/%f && cp %p /pg13/archive/%f' restore_command='cp /pg13/archive/%f %p' # 主从流复制 hot_standby=on max_wal_senders=10 wal_sender_timeout=60s wal_keep_size=16MB # 主从切换参数,启用PG数据库的复制槽,PG12不需要"use_replication_slots=true"这个参数了。 max_replication_slots=10 wal_log_hints=on # 自动切换 shared_preload_libraries ='repmgr' EOF -- 修改参数pg_hba.conf cat >> /pg13/pgdata/pg_hba.conf <<"EOF" # TYPE DATABASE USER ADDRESS METHOD local all all trust host all all 127.0.0.1/32 trust host all all 0.0.0.0/0 md5 host replication all 0.0.0.0/0 md5 local repmgr repmgr md5 host repmgr repmgr 127.0.0.1/32 md5 host repmgr repmgr 172.72.6.0/24 md5 local replication repmgr md5 host replication repmgr 127.0.0.1/32 md5 host replication repmgr 172.72.6.0/24 md5 EOF -- 启动数据库 /pg13/pg13/bin/pg_ctl -D /pg13/pgdata -l logfile start |
4.7.1.2、创建相关用户和数据库
1 2 3 4 5 6 | su - pg13 createuser -s repmgr createdb repmgr -O repmgr psql -h 127.0.0.1 -c "alter user repmgr with password 'lhr';" psql -h 127.0.0.1 -c "alter user repmgr set search_path to repmgr, \"\$user\",public;" |
4.7.2、注册为witness节点
1 2 | repmgr -f /pg13/pg13/repmgr.conf witness register -h 172.72.6.61 -U repmgr -d repmgr --force repmgr -f /pg13/pg13/repmgr.conf cluster show |
执行过程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | [pg13@lhrrepmgr64364 ~]$ repmgr -f /pg13/pg13/repmgr.conf witness register -h 172.72.6.61 -U repmgr -d repmgr --force INFO: connecting to witness node "lhrrepmgr64364" (ID: 4) INFO: connecting to primary node NOTICE: attempting to install extension "repmgr" NOTICE: "repmgr" extension successfully installed INFO: witness registration complete NOTICE: witness node "lhrrepmgr64364" (ID: 4) successfully registered -- 所有节点查询都可以获取如下的结果 [pg13@lhrrepmgr64364 ~]$ repmgr -f /pg13/pg13/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+----------------+---------+-----------+----------------+----------+----------+----------+--------------------------------------------------------------------------- 1 | lhrrepmgr64361 | primary | * running | | default | 100 | 1 | host=172.72.6.61 user=repmgr password=lhr dbname=repmgr connect_timeout=2 2 | lhrrepmgr64362 | standby | running | lhrrepmgr64361 | default | 100 | 1 | host=172.72.6.62 user=repmgr password=lhr dbname=repmgr connect_timeout=2 3 | lhrrepmgr64363 | standby | running | lhrrepmgr64361 | default | 100 | 1 | host=172.72.6.63 user=repmgr password=lhr dbname=repmgr connect_timeout=2 4 | lhrrepmgr64364 | witness | * running | lhrrepmgr64361 | default | 0 | n/a | host=172.72.6.64 user=repmgr password=lhr dbname=repmgr connect_timeout=2 repmgr=# select * from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time ------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+------------+------------+------------+------------+-----------+-----------+------------+---------------+------------+------------------------------- 7270 | 16385 | repmgr | lhrrepmgr64362 | 172.72.6.62 | | 54710 | 2021-04-28 19:07:18.248872+08 | | streaming | 0/7DF703C0 | 0/7DF703C0 | 0/7DF703C0 | 0/7DF703C0 | | | | 0 | async | 2021-04-29 12:06:07.924676+08 7276 | 16385 | repmgr | lhrrepmgr64363 | 172.72.6.63 | | 54888 | 2021-04-28 19:07:18.678135+08 | | streaming | 0/7DF703C0 | 0/7DF703C0 | 0/7DF703C0 | 0/7DF703C0 | | | | 0 | async | 2021-04-29 12:06:07.956466+08 (2 rows) |