PG 14 + Pgpool-II + Watchdog for High Availability

Reference: https://www.pgpool.net/docs/latest/en/html/example-cluster.html

Pgpool-II + Watchdog Configuration Example

This section shows an example of a streaming replication configuration using Pgpool-II. In this example, we use 3 Pgpool-II servers to manage the PostgreSQL servers, creating a robust cluster system that avoids a single point of failure and split brain.

PostgreSQL 14 is used in this configuration example. All scripts have been tested with PostgreSQL 10 and later.

1. Requirements

We assume that all the Pgpool-II servers and the PostgreSQL servers are in the same subnet.

2. Cluster System Configuration

We use 3 servers with CentOS 7.9 installed. Let these servers be server1, server2, and server3. We install PostgreSQL and Pgpool-II on each server.

Figure 8-1. Cluster System Configuration

Note: The roles of Active, Standby, Primary, Standby are not fixed and may be changed by further operations.

Table 8-2. Hostname and IP address

Hostname | IP Address      | Virtual IP
server1  | 192.168.137.101 | 192.168.137.150
server2  | 192.168.137.102 | 192.168.137.150
server3  | 192.168.137.103 | 192.168.137.150

Table 8-3. PostgreSQL version and Configuration

Item                | Value                  | Detail
PostgreSQL Version  | 14.0                   | -
port                | 5432                   | -
$PGDATA             | /var/lib/pgsql/14/data | -
Archive mode        | on                     | /var/lib/pgsql/archivedir
Replication Slots   | Enable                 | -
Start automatically | Enable                 | -

Table 8-4. Pgpool-II version and Configuration

Item                 | Value                             | Detail
Pgpool-II Version    | 4.3.0                             | -
port                 | 9999                              | Pgpool-II accepts connections
                     | 9898                              | PCP process accepts connections
                     | 9000                              | watchdog accepts connections
                     | 9694                              | UDP port for receiving Watchdog's heartbeat signal
Config file          | /etc/pgpool-II/pgpool.conf        | Pgpool-II config file
Pgpool-II start user | postgres (Pgpool-II 4.1 or later) | Pgpool-II 4.0 or before, the default startup user is root
Running mode         | streaming replication mode        | -
Watchdog             | on                                | Life check method: heartbeat
Start automatically  | Enable                            | -

Table 8-5. Various sample scripts included in rpm package

Feature         | Script                                    | Detail
Failover        | /etc/pgpool-II/failover.sh.sample         | Run by failover_command to perform failover
                | /etc/pgpool-II/follow_primary.sh.sample   | Run by follow_primary_command to synchronize the Standby with the new Primary after failover
Online recovery | /etc/pgpool-II/recovery_1st_stage.sample  | Run by recovery_1st_stage_command to recover a Standby node
                | /etc/pgpool-II/pgpool_remote_start.sample | Run after recovery_1st_stage_command to start the Standby node
Watchdog        | /etc/pgpool-II/escalation.sh.sample       | Run by wd_escalation_command to switch the Active/Standby Pgpool-II safely

The above scripts are included in the RPM package and can be customized as needed.

3. Installation

In this example, we install Pgpool-II and PostgreSQL RPM packages with YUM.

Install PostgreSQL from PostgreSQL YUM repository.

Since Pgpool-II related packages are also included in PostgreSQL YUM repository, add the "exclude" settings to /etc/yum.repos.d/pgdg-redhat-all.repo so that Pgpool-II is not installed from PostgreSQL YUM repository.

The following is a setting example of /etc/yum.repos.d/pgdg-redhat-all.repo.

Install Pgpool-II from Pgpool-II YUM repository.

4. Before Starting

Before you start the configuration process, please check the following prerequisites.

  • Set up PostgreSQL streaming replication on the primary server. In this example, we use WAL archiving.

    First, we create the directory /var/lib/pgsql/archivedir to store WAL segments on all servers. In this example, only Primary node archives WAL locally.

    Then we edit the configuration file $PGDATA/postgresql.conf on server1 (primary) as follows. Enable wal_log_hints to use pg_rewind. Since the Primary may become a Standby later, we set hot_standby = on.

    We use the online recovery functionality of Pgpool-II to set up the standby servers after the primary server is started.

  • For security reasons, we create a user repl used solely for replication, and a user pgpool for the streaming replication delay check and health check of Pgpool-II.

    Table 8-6. Users

    User Name | Password | Detail
    repl      | repl     | PostgreSQL replication user
    pgpool    | pgpool   | Pgpool-II health check (health_check_user) and replication delay check (sr_check_user) user
    postgres  | postgres | User running online recovery

    If you want to show the "replication_state" and "replication_sync_state" columns in the SHOW POOL NODES command result, the role pgpool needs to be a PostgreSQL superuser or a member of the pg_monitor group (Pgpool-II 4.1 or later). Grant pg_monitor to pgpool.

    Note: If you plan to use detach_false_primary (Pgpool-II 4.0 or later), the role "pgpool" needs to be a PostgreSQL superuser or a member of the "pg_monitor" group to use this feature.

    We assume that all the Pgpool-II servers and the PostgreSQL servers are in the same subnet, and edit pg_hba.conf to enable the scram-sha-256 authentication method.

  • To use the automated failover and online recovery of Pgpool-II, passwordless SSH to all backend servers must be allowed between the Pgpool-II execution user (default root user) and the postgres user, and between the postgres user on one server and the postgres user on another. Execute the following command on all servers to set up passwordless SSH. The generated key file name is id_rsa_pgpool.

    After setting SSH, use ssh postgres@serverX -i ~/.ssh/id_rsa_pgpool command to make sure that you can log in without entering a password. Edit /etc/ssh/sshd_config if necessary and restart sshd.

  • To allow the repl user to perform streaming replication and online recovery without specifying a password, and to run pg_rewind as postgres, we create the .pgpass file in the postgres user's home directory and change its permissions to 600 on each PostgreSQL server.

  • When connecting to the Pgpool-II and PostgreSQL servers, the target ports must be opened in the firewall management software. Following is an example for CentOS/RHEL 7.

  • We set Pgpool-II to start automatically on all servers.

    Note: If you enable auto-start of Pgpool-II, set search_primary_node_timeout to a value long enough for PostgreSQL to start after the server has booted. Pgpool-II will fail if it cannot connect to the backend PostgreSQL within search_primary_node_timeout.
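
The .pgpass entries described in the prerequisites above can be generated rather than typed by hand. The following is a minimal sketch, not the original article's commands: it assumes the users from Table 8-6, port 5432, and PostgreSQL's hostname:port:database:username:password line format, and the function name pgpass_lines and the placeholder passwords are hypothetical.

```python
def escape(field):
    """Escape backslashes and colons, which .pgpass treats specially."""
    return str(field).replace("\\", "\\\\").replace(":", "\\:")


def pgpass_lines(hosts, port, entries):
    """Return one hostname:port:database:username:password line per host and entry."""
    lines = []
    for database, user, password in entries:
        for host in hosts:
            fields = [host, port, database, user, password]
            lines.append(":".join(escape(f) for f in fields))
    return lines


# Assumed values: hostnames from Table 8-2, users from Table 8-6.
hosts = ["server1", "server2", "server3"]
entries = [
    ("replication", "repl", "<repl user password>"),
    ("postgres", "postgres", "<postgres user password>"),
]
print("\n".join(pgpass_lines(hosts, 5432, entries)))
```

The output would be saved as .pgpass in the postgres user's home directory with permission 600, as described above.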

5. Create pgpool_node_id

Since Pgpool-II 4.2, all configuration parameters are identical on all hosts. If the watchdog feature is enabled, a pgpool_node_id file is required to distinguish which host is which. You need to create a pgpool_node_id file and specify the pgpool (watchdog) node number (e.g. 0, 1, 2 ...) that identifies the pgpool (watchdog) host.

  • server1

  • server2

  • server3
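
As an illustration of the node numbering, the sketch below derives the content of pgpool_node_id from a host's position in the server list. It assumes server1, server2, server3 map to 0, 1, 2 and that the file sits next to pgpool.conf as /etc/pgpool-II/pgpool_node_id. The helper names are hypothetical.

```python
from pathlib import Path

# Assumed ordering: the index in this list is the watchdog node number.
HOSTS = ["server1", "server2", "server3"]


def node_id_content(hostname, hosts=HOSTS):
    """Return the text to store in pgpool_node_id for the given host."""
    return f"{hosts.index(hostname)}\n"


def write_node_id(hostname, path="/etc/pgpool-II/pgpool_node_id"):
    """Write the node number to the pgpool_node_id file (path is an assumption)."""
    Path(path).write_text(node_id_content(hostname))


print(node_id_content("server2"), end="")
```

Each server would run write_node_id with its own hostname, so that the three files contain 0, 1, and 2 respectively.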

6. Pgpool-II Configuration

When installing Pgpool-II using YUM, the Pgpool-II configuration file pgpool.conf is installed in /etc/pgpool-II.

Since Pgpool-II 4.2, all configuration parameters are identical on all hosts, so you can edit pgpool.conf on any pgpool node and copy the edited file to the other pgpool nodes.

6.1. Clustering mode

Pgpool-II has several clustering modes. To set the clustering mode, backend_clustering_mode can be used. In this configuration example, streaming replication mode is used.

6.2. listen_addresses

To allow Pgpool-II to accept all incoming connections, we set listen_addresses = '*'.

6.3. port

Specify the port number Pgpool-II listens on.

6.4. Streaming Replication Check

Specify replication delay check user and password in sr_check_user and sr_check_password. In this example, we leave sr_check_password empty, and create the entry in pool_passwd. See Section 6.9 for how to create the entry in pool_passwd. From Pgpool-II 4.0, if these parameters are left blank, Pgpool-II will first try to get the password for that specific user from pool_passwd file before using the empty password.

6.5. Health Check

Enable health check so that Pgpool-II performs failover. If the network is unstable, the health check may fail even though the backend is running properly, and a failover or degenerate operation may occur. To prevent such false detections, we set health_check_max_retries = 3. Specify health_check_user and health_check_password in the same way as sr_check_user and sr_check_password.

6.6. Backend Settings

Specify the PostgreSQL backend information. Multiple backends can be specified by adding a number at the end of the parameter name.

To show "replication_state" and "replication_sync_state" column in SHOW POOL NODES command result, backend_application_name parameter is required. Here we specify each backend's hostname in these parameters. (Pgpool-II 4.1 or later)
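
Because the backend parameters differ only in their numeric suffix, they can be rendered from the host list. This is a sketch under assumptions: port 5432 and the data directory come from Table 8-3, while the weight of 1 and the ALLOW_TO_FAILOVER flag are illustrative choices, and the function name backend_settings is hypothetical.

```python
def backend_settings(hosts, port=5432, data_dir="/var/lib/pgsql/14/data"):
    """Render pgpool.conf backend lines, one numbered group per PostgreSQL host."""
    lines = []
    for i, host in enumerate(hosts):
        lines += [
            f"backend_hostname{i} = '{host}'",
            f"backend_port{i} = {port}",
            f"backend_weight{i} = 1",  # assumed: equal load-balancing weight
            f"backend_data_directory{i} = '{data_dir}'",
            f"backend_flag{i} = 'ALLOW_TO_FAILOVER'",  # assumed flag value
            f"backend_application_name{i} = '{host}'",  # hostname, as described above
        ]
    return lines


print("\n".join(backend_settings(["server1", "server2", "server3"])))
```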

6.7. Failover configuration

Specify the failover.sh script to be executed after failover in the failover_command parameter. If we use 3 PostgreSQL servers, we need to specify follow_primary_command to run after a failover of the primary node. With two PostgreSQL servers, the follow_primary_command setting is not necessary.

Pgpool-II replaces the following special characters with the backend specific information while executing the scripts. See failover_command for more details about each character.

Note: %N and %S are added in Pgpool-II 4.1. Please note that these characters cannot be specified if using Pgpool-II 4.0 or earlier.
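
To make the placeholder mechanism concrete, the sketch below builds a failover_command value and shows how a script could map the positional arguments it receives back to names. The placeholder meanings follow our reading of the failover_command documentation and should be checked against your Pgpool-II version. The key names and both helper functions are hypothetical.

```python
# (placeholder, meaning) pairs in the order they are passed to the script.
PLACEHOLDERS = [
    ("%d", "failed_node_id"),
    ("%h", "failed_node_host"),
    ("%p", "failed_node_port"),
    ("%D", "failed_node_pgdata"),
    ("%m", "new_main_node_id"),
    ("%H", "new_main_node_host"),
    ("%M", "old_main_node_id"),
    ("%P", "old_primary_node_id"),
    ("%r", "new_main_node_port"),
    ("%R", "new_main_node_pgdata"),
    ("%N", "old_primary_node_host"),  # Pgpool-II 4.1 or later
    ("%S", "old_primary_node_port"),  # Pgpool-II 4.1 or later
]


def command_template(script):
    """Return the script path followed by every placeholder, space separated."""
    return " ".join([script] + [p for p, _ in PLACEHOLDERS])


def parse_args(argv):
    """Map the substituted positional arguments back to descriptive names."""
    if len(argv) != len(PLACEHOLDERS):
        raise ValueError(f"expected {len(PLACEHOLDERS)} arguments, got {len(argv)}")
    return {name: value for (_, name), value in zip(PLACEHOLDERS, argv)}


print("failover_command = '%s'" % command_template("/etc/pgpool-II/failover.sh"))
```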

Sample scripts failover.sh and follow_primary.sh are installed in /etc/pgpool-II/. Create failover scripts using these sample files.

Basically, it should work if you change PGHOME according to PostgreSQL installation directory.

Since user authentication is required to use the PCP command in follow_primary_command script, we need to specify user name and md5 encrypted password in pcp.conf in format "username:encrypted password".

If the pgpool user is specified in PCP_USER in follow_primary.sh, we use pg_md5 to create the encrypted password entry for the pgpool user.

Since follow_primary.sh script must execute PCP command without entering a password, we need to create .pcppass in the home directory of Pgpool-II startup user (postgres user) on each server.
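
The two credential files can be sketched as follows. This is an illustration rather than a replacement for pg_md5: it assumes the pcp.conf hash is the plain MD5 of the password, that .pcppass uses the hostname:port:username:password format, and that PCP listens on port 9898 as in Table 8-4. The function names are hypothetical.

```python
import hashlib


def pcp_conf_entry(user, password):
    """Return a pcp.conf line in the "username:encrypted password" format."""
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return f"{user}:{digest}"


def pcppass_line(user, password, host="localhost", port=9898):
    """Return a .pcppass line (assumed format hostname:port:username:password)."""
    return f"{host}:{port}:{user}:{password}"


# "<pgpool user password>" is a placeholder, not a real credential.
print(pcp_conf_entry("pgpool", "<pgpool user password>"))
print(pcppass_line("pgpool", "<pgpool user password>"))
```

As with .pgpass, the .pcppass file should be readable only by its owner (permission 600).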

Note: The follow_primary.sh script does not support tablespaces. If you are using tablespaces, you need to modify the script to support tablespaces.

6.8. Pgpool-II Online Recovery Configurations

Next, in order to perform online recovery with Pgpool-II we specify the PostgreSQL user name and online recovery command recovery_1st_stage. Because Superuser privilege in PostgreSQL is required for performing online recovery, we specify postgres user in recovery_user. Then, we create recovery_1st_stage and pgpool_remote_start in database cluster directory of PostgreSQL primary server (server1), and add execute permission.

Online recovery sample scripts recovery_1st_stage and pgpool_remote_start are installed in /etc/pgpool-II/. Copy these files to the data directory of the primary server (server1).

Basically, it should work if you change PGHOME according to PostgreSQL installation directory.

In order to use the online recovery functionality, the functions of pgpool_recovery, pgpool_remote_start, pgpool_switch_xlog are required, so we need to install pgpool_recovery on template1 of PostgreSQL server server1.

Note: The recovery_1st_stage script does not support tablespaces. If you are using tablespaces, you need to modify the script to support tablespaces.

6.9. Client Authentication Configuration

Because we already set the PostgreSQL authentication method to scram-sha-256 in the section Before Starting, it is necessary to configure client authentication in Pgpool-II to connect to the backend nodes. When installing with RPM, the Pgpool-II configuration file pool_hba.conf is in /etc/pgpool-II. By default, pool_hba authentication is disabled; set enable_pool_hba = on to enable it.

The format of pool_hba.conf file follows very closely PostgreSQL's pg_hba.conf format. Set pgpool and postgres user's authentication method to scram-sha-256.

Note: In Pgpool-II 4.0, only an AES encrypted password or a clear text password can be specified in health_check_password, sr_check_password, wd_lifecheck_password, and recovery_password in pgpool.conf.

The default password file name for authentication is pool_passwd. To use scram-sha-256 authentication, the decryption key to decrypt the passwords is required. We create the .pgpoolkey file in Pgpool-II start user postgres's (Pgpool-II 4.1 or later) home directory. (Pgpool-II 4.0 or before, by default Pgpool-II is started as root)

Execute the command pg_enc -m -k /path/to/.pgpoolkey -u username -p to register the user name and AES encrypted password in the file pool_passwd. If pool_passwd doesn't exist yet, it will be created in the same directory as pgpool.conf.

6.10. Watchdog Configuration

Enable watchdog functionality on server1, server2, server3.

Specify virtual IP address that accepts connections from clients on server1, server2, server3. Ensure that the IP address set to virtual IP isn't used yet.

To bring up/down the virtual IP and send the ARP requests, we set if_up_cmd, if_down_cmd and arping_cmd. The network interface used in this example is "enp0s8". Since root privilege is required to execute if_up/down_cmd or arping_cmd, either use setuid on these commands or allow the Pgpool-II startup user, the postgres user (Pgpool-II 4.1 or later), to run sudo without a password.

Note: If Pgpool-II is installed using RPM, the postgres user has been configured to run ip/arping via sudo without a password.

Here we configure the following parameters to run if_up/down_cmd or arping_cmd with sudo.
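
A sketch of what the sudo-based settings might look like, rendered for a given interface. The command paths (/usr/bin/sudo, /sbin/ip, /usr/sbin/arping), the /24 prefix length, and the label suffix ":0" are assumptions about the environment. $_IP_$ is the placeholder Pgpool-II replaces with the virtual IP, and vip_commands is a hypothetical helper.

```python
def vip_commands(device, prefix_len=24):
    """Render if_up_cmd, if_down_cmd and arping_cmd lines for one network interface."""
    return [
        f"if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/{prefix_len} "
        f"dev {device} label {device}:0'",
        f"if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/{prefix_len} dev {device}'",
        f"arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I {device}'",
    ]


# "enp0s8" is the interface named in this example.
print("\n".join(vip_commands("enp0s8")))
```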

Note: If "Defaults requiretty" is set in /etc/sudoers, please ensure that the pgpool startup user can execute the if_up_cmd, if_down_cmd and arping_cmd commands without a tty.

Set if_cmd_path and arping_path according to the command path. If if_up/down_cmd or arping_cmd starts with "/", these parameters will be ignored.

Specify all Pgpool-II nodes information for configuring watchdog. Specify pgpool_portX using the port number specified in port in Section 6.3.

Specify the method of lifecheck wd_lifecheck_method and the lifecheck interval wd_interval. Here, we use heartbeat method to perform watchdog lifecheck.

Specify all Pgpool-II nodes information for sending and receiving heartbeat signal.
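
The watchdog node list and the heartbeat destination list follow the same numbered-suffix pattern, so both can be rendered from the host list. In this sketch the ports 9000, 9999 and 9694 come from Table 8-4, the empty heartbeat device is an assumption, and the function names are hypothetical.

```python
def watchdog_nodes(hosts, wd_port=9000, pgpool_port=9999):
    """Render the hostnameN / wd_portN / pgpool_portN lines for every Pgpool-II node."""
    lines = []
    for i, host in enumerate(hosts):
        lines += [
            f"hostname{i} = '{host}'",
            f"wd_port{i} = {wd_port}",
            f"pgpool_port{i} = {pgpool_port}",
        ]
    return lines


def heartbeat_nodes(hosts, port=9694, device=""):
    """Render the heartbeat_hostnameN / heartbeat_portN / heartbeat_deviceN lines."""
    lines = []
    for i, host in enumerate(hosts):
        lines += [
            f"heartbeat_hostname{i} = '{host}'",
            f"heartbeat_port{i} = {port}",
            f"heartbeat_device{i} = '{device}'",  # assumed: empty means any interface
        ]
    return lines


hosts = ["server1", "server2", "server3"]
print("\n".join(watchdog_nodes(hosts) + heartbeat_nodes(hosts)))
```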

If wd_lifecheck_method is set to heartbeat, specify the time to detect a fault in wd_heartbeat_deadtime and the interval for sending heartbeat signals in wd_heartbeat_keepalive.

When Watchdog process is abnormally terminated, the virtual IP may be "up" on both of the old and new active pgpool nodes. To prevent this, configure wd_escalation_command to bring down the virtual IP on other pgpool nodes before bringing up the virtual IP on the new active pgpool node.

The sample script escalation.sh is installed in /etc/pgpool-II/.

Basically, it should work if you change the following variables according to your environment. PGPOOL is the array of hostnames running Pgpool-II. VIP is the virtual IP address that you set as delegate_IP. DEVICE is the network interface for the virtual IP.

Note: If you have an even number of watchdog nodes, you need to turn on the enable_consensus_with_half_votes parameter.

Note: If use_watchdog = on, please make sure the pgpool node number is specified in pgpool_node_id file. See Section 5 for details.
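
The note on even-numbered watchdog clusters can be illustrated with a simplified model of the majority rule. This is our reading of how enable_consensus_with_half_votes is documented, not Pgpool-II's actual implementation: normally more than half of the nodes must be alive, and with the parameter on, exactly half is enough when the node count is even. The function name has_quorum is hypothetical.

```python
def has_quorum(total_nodes, alive_nodes, enable_consensus_with_half_votes=False):
    """Return True if the alive watchdog nodes (including the local one) form a quorum."""
    if not 0 <= alive_nodes <= total_nodes:
        raise ValueError("alive_nodes must be between 0 and total_nodes")
    if total_nodes % 2 == 0 and enable_consensus_with_half_votes:
        # Even cluster with the parameter on: exactly half of the votes suffices.
        return alive_nodes * 2 >= total_nodes
    # Otherwise a strict majority is required.
    return alive_nodes * 2 > total_nodes


for total, alive, half in [(3, 2, False), (2, 1, False), (2, 1, True)]:
    print(total, alive, half, has_quorum(total, alive, half))
```

With the 3 nodes used in this example, the cluster keeps its quorum when one Pgpool-II node is down, which is why the parameter is only relevant for even node counts.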

6.11. Logging

Since Pgpool-II 4.2, the logging collector process has been implemented. In the example, we enable logging collector.

Create the log directory on all servers.

The configuration of pgpool.conf on server1 is completed. Copy the pgpool.conf to other Pgpool-II nodes (server2 and server3).

7. /etc/sysconfig/pgpool Configuration

When starting Pgpool-II, if the pgpool_status file exists, Pgpool-II will read the backend status (up/down) from the pgpool_status file.

If you want to ignore the pgpool_status file at startup of Pgpool-II, add "-D" to the start option OPTS in /etc/sysconfig/pgpool.

8. Starting/Stopping Pgpool-II

Next we start Pgpool-II. Before starting Pgpool-II, please start PostgreSQL servers first. Also, when stopping PostgreSQL, it is necessary to stop Pgpool-II first.

  • Starting Pgpool-II

    In section Before Starting, we already set the auto-start of Pgpool-II. To start Pgpool-II, restart the whole system or execute the following command.

  • Stopping Pgpool-II

9. How to use

Let's start to use Pgpool-II. First, let's start Pgpool-II on server1, server2, server3 by using the following command.

9.1. Set up PostgreSQL standby server

First, we should set up PostgreSQL standby server by using Pgpool-II online recovery functionality. Ensure that recovery_1st_stage and pgpool_remote_start scripts used by pcp_recovery_node command are in database cluster directory of PostgreSQL primary server (server1).

After executing pcp_recovery_node command, verify that server2 and server3 are started as PostgreSQL standby server.

9.2. Switching active/standby watchdog

Confirm the watchdog status by using pcp_watchdog_info. The Pgpool-II server that is started first runs as LEADER.

Stop active server server1, then server2 or server3 will be promoted to active server. To stop server1, we can stop Pgpool-II service or shutdown the whole system. Here, we stop Pgpool-II service.

Start the Pgpool-II that we stopped on server1 again, and verify that server1 runs as a standby.

9.3. Failover

First, use psql to connect to PostgreSQL via virtual IP, and verify the backend information.

Next, stop primary PostgreSQL server server1, and verify automatic failover.

After stopping PostgreSQL on server1, failover occurs and PostgreSQL on server2 becomes the new primary.

server3 is running as standby of new primary server2.

9.4. Online Recovery

Here, we use the Pgpool-II online recovery functionality to restore server1 (the old primary server) as a standby. Before restoring the old primary server, please ensure that the recovery_1st_stage and pgpool_remote_start scripts exist in the database cluster directory of the current primary server, server2.

Then verify that server1 is started as a standby.
