PG分布式之PGXL安装部署
简介
文档:https://www.postgres-xl.org/documentation/index.html
https://www.postgres-xl.org/overview/
https://wiki.postgresql.org/wiki/Postgres-XC
Postgres-XL是一款开源的PG集群软件,XL代表eXtensible Lattice,即可扩展的PG“格子”之意,以下简称PGXL。
官方称其既适合写操作压力较大的OLTP应用,又适合读操作为主的大数据应用。它的前身是Postgres-XC(简称PGXC),PGXC是在PG的基础上加入了集群功能,主要适用于OLTP应用。PGXL是在PGXC的基础上的升级产品,加入了一些适用于OLAP应用的特性,如 Massively Parallel Processing (MPP) 特性。
通俗的说PGXL的代码是包含PG代码,使用PGXL安装PG集群并不需要单独安装PG。这样带来的一个问题是无法随意选择任意版本的PG,好在PGXL跟进PG较及时,目前最新版本Postgres-XL 10R1,基于PG 10。
Postgres-XL是由多个PostgreSQL数据库集群组成的,但看起来是单个数据库集群一样。根据你的设计,每个表都可以在各个数据库之间进行复制或分发。
为了实现这一目标,Postgres-XL是由GTM,Coordinator和Datanode三部分组成。 GTM负责支持事务的ACID。 Datanode存储数据并处理SQL操作(只能操作自己存储的数据)。Coordinator分析来自应用程序的SQL操作,确定哪个Datanode包含数据,并将指令发送到正确的Datanode。
通常情况下,GTM应该安装在单独的服务器上,因为GTM要处理所有Coordinator和Datanode的事务需求。你可以配置GTM-Proxy(GTM代理)来分组同一服务器上运行的Coordinator和Datanode的请求和响应, GTM-Proxy减少了与GTM的交互次数和数据量。 GTM代理还还可以处理GTM故障。
在同一台服务器上同时部署Coordinator和Datanode通常是很好的做法,这样我们就不必担心两者之间的负载平衡,如果是复制表的话,不需要发送额外的网络请求就可以从本地拿到数据。你可以部署任意数量的服务器(Coordinator和Datanode同时运行)。Coordinator和Datanode都是PostgreSQL实例,你可能需要做些配置使它们避免资源冲突。例如为它们分配不同的工作目录和端口号是非常重要的。
Postgres-XL允许多个Coordinator单独从应用程序接受SQL指令,而不是集中的方式。写操作可以通过任何一个Coordinator来完成,没有任何区别。他们看起来就像是单一的数据库。Coordinator的职责是接受和分销SQL指令,查找哪些Datanodes存储相应的数据,可能需要将查询计划发送到适当的Datanodes,然后收集结果并将其返回给应用程序。
Coordinator不存储用户数据。它仅存储目录数据,用来确定如何处理SQL语句以及查找目标Datanodes等等。你不必过分担心Coordinator失败,当一个Coordinator失败时,你可以切换到另一个。
GTM可能发生单点故障(SPOF)。为了防止这种情况,你可以运行另一个GTM(GTM-Standby)来备份主GTM的状态。当主GTM失败时,GTM-Proxy可以随时切换到备用。
如上所述,Postgres-XL的Coordinator和Datanodes都是是PostgreSQL数据库。在数据库范畴,PostgreSQL使用客户端/服务器模型。 PostgreSQL会话包含如下两个服务:
server,服务端进程,管理数据库文件、接受客户端应用程序的连接,为client执行数据库操作。该进程称为postgres。
client,客户端,需要执行数据库操作。客户端应用程序多种多样:可以是文本工具,图形应用程序,访问数据库以显示网页的Web服务器或专门的数据库维护工具。一些客户端应用程序随PostgreSQL发行版提供;大多数是由用户开发的。
在典型的客户端/服务器应用程序中,客户端和服务器部署在不同的主机上。它们通过TCP / IP网络连接进行通信。需要注意的是,有些文件在客户端上可以访问,在数据库服务器上却不行(可能只是文件名不同)。
PostgreSQL服务器可以处理来自客户端的多个并发连接。为此,它为每个连接启动一个新进程。连接一旦建立,原始的postgres进程不会干预客户端和新的服务器进程之间的通信。主服务进程是始终运行的,等待客户端的连接,期间,有很多连接产生或消亡。
组件简介
- Global Transaction Monitor (GTM)
全局事务管理器,确保群集范围内的事务一致性。 GTM负责发放事务ID和快照作为其多版本并发控制的一部分。
集群可选地配置一个备用GTM,以改进可用性。此外,可以在协调器间配置代理GTM, 可用于改善可扩展性,减少GTM的通信量。 - GTM Standby
GTM的备节点,在pgxc,pgxl中,GTM控制所有的全局事务分配,如果出现问题,就会导致整个集群不可用,为了增加可用性,增加该备用节点。当GTM出现问题时,GTM Standby可以升级为GTM,保证集群正常工作。 - GTM-Proxy
GTM需要与所有的Coordinators通信,为了降低压力,可以在每个Coordinator机器上部署一个GTM-Proxy。 - Coordinator
协调员管理用户会话,并与GTM和数据节点进行交互。协调员解析,并计划查询,并给语句中的每一个组件发送下一个序列化的全局性计划。
为节省机器,通常此服务和数据节点部署在一起。 - Data Node
数据节点是数据实际存储的地方。数据的分布可以由DBA来配置。为了提高可用性,可以配置数据节点的热备以便进行故障转移准备。
总结:gtm是负责ACID的,保证分布式数据库全局事务一致性。得益于此,就算数据节点是分布的,但是你在主节点操作增删改查事务时,就如同只操作一个数据库一样简单。Coordinator是调度的,将操作指令发送到各个数据节点。datanodes是数据节点,分布式存储数据。
规划
准备三台Centos7服务器(或者虚拟机),版本为“CentOS Linux release 7.6.1810 (Core) ”,集群规划如下:
主机名 | IP | 角色 | 端口 | nodename | 数据目录 |
---|---|---|---|---|---|
lhrpgxl90 | 172.72.6.90 | GTM | 6666 | gtm | $PGHOME/data/gtm |
GTM Slave | 20001 | gtmSlave | $PGHOME/data/gtmSlave | ||
lhrpgxl91 | 172.72.6.91 | Coordinator | 5432 | coord1 | $PGHOME/data/coord |
Datanode | 5433 | datanode1 | $PGHOME/data/dn_master | ||
Datanode Slave | 15433 | datanode1_slave | $PGHOME/data/dn_slave | ||
GTM Proxy | 6666 | gtm_pxy1 | $PGHOME/data/gtm_pxy | ||
lhrpgxl92 | 172.72.6.92 | Coordinator | 5432 | coord2 | $PGHOME/data/coord |
Datanode | 5433 | datanode2 | $PGHOME/data/dn_master | ||
Datanode Slave | 15433 | datanode2_slave | $PGHOME/data/dn_slave | ||
GTM Proxy | 6666 | gtm_pxy2 | $PGHOME/data/gtm_pxy |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | -- 网卡 docker network create --subnet=172.72.6.0/24 pg-network docker rm -f lhrpgxl90 docker run -d --name lhrpgxl90 -h lhrpgxl90 \ --net=pg-network --ip 172.72.6.90 \ -p 64390:5432 \ -v /sys/fs/cgroup:/sys/fs/cgroup \ --privileged=true lhrbest/lhrcentos76:8.5 \ /usr/sbin/init docker rm -f lhrpgxl91 docker run -d --name lhrpgxl91 -h lhrpgxl91 \ --net=pg-network --ip 172.72.6.91 \ -p 64391:5432 \ -v /sys/fs/cgroup:/sys/fs/cgroup \ --privileged=true lhrbest/lhrcentos76:8.5 \ /usr/sbin/init docker rm -f lhrpgxl92 docker run -d --name lhrpgxl92 -h lhrpgxl92 \ --net=pg-network --ip 172.72.6.92 \ -p 64392:5432 \ -v /sys/fs/cgroup:/sys/fs/cgroup \ --privileged=true lhrbest/lhrcentos76:8.5 \ /usr/sbin/init [root@docker35 ~]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 198a42183f53 lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 54 seconds ago Up 52 seconds 0.0.0.0:64392->5432/tcp, :::64392->5432/tcp lhrpgxl92 90ea3592000b lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 56 seconds ago Up 54 seconds 0.0.0.0:64391->5432/tcp, :::64391->5432/tcp lhrpgxl91 c11615c37160 lhrbest/lhrcentos76:8.5 "/usr/sbin/init" 58 seconds ago Up 56 seconds 0.0.0.0:64390->5432/tcp, :::64390->5432/tcp lhrpgxl90 |
环境准备
安装之前,需要先确保机器满足一些先决条件。
要运行pgxc_ctl的节点需要支持无密码ssh访问。
在所有机器上,正确设置PATH环境变量包含Postgres-XL数据文件,特别是在通过ssh运行命令时。
必须配置pg_hba.conf允许远程访问。 pgxc_ctl.conf配置文件中诸如coordPgHbaEntries和datanodePgHbaEntries都可能需要适当的更改。
配置防火墙和iptables使某些端口可以正常访问。
如果没有安装pgxc_ctl,可以从源代码编译并安装。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | yum install -y flex bison readline-devel zlib-devel openjade docbook-style-dsssl gcc make groupadd -g 5432 postgres useradd -u 5432 -g postgres postgres echo "postgres:lhr" | chpasswd mkdir -p /postgresxl chown -R postgres:postgres /postgresxl cat >> /home/postgres/.bashrc <<"EOF" export PGHOME=/postgresxl export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH export PATH=$PGHOME/bin:$PATH export PGUSER=postgres export PGXC_CTL_HOME=/postgresxl/bin EOF echo "postgres ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers |
下载安装
https://www.postgres-xl.org/download/
https://git.postgresql.org/gitweb/?p=postgres-xl.git;a=summary
在3台主机都需要安装PGXC,文件大约300MB,如下:
1 2 3 4 5 6 7 8 9 10 11 | su - postgres git clone git://git.postgresql.org/git/postgres-xl.git cd postgres-xl ./configure --prefix=/postgresxl make -j4 sudo make install cd contrib make -j4 sudo make install chown -R postgres.postgres /postgresxl/ |
cortrib中有很多postgres很牛的工具,一般要装上。如ltree,uuid,postgres_fdw等等。
配置主节点可以无密码访问备节点
1 2 3 | ./sshUserSetup.sh -user postgres -hosts "lhrpgxl90 lhrpgxl91 lhrpgxl92" -advanced exverify -confirm sudo chmod 600 /home/postgres/.ssh/config |
集群配置
以下内容在lhrpgxl90上运行即可。
生成pgxc_ctl配置文件
1 2 3 4 5 6 7 8 9 10 11 12 | [postgres@lhrpgxl90 ~]$ pgxc_ctl prepare /usr/bin/bash Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. ERROR: File "/postgresxl/bin/pgxc_ctl.conf" not found or not a regular file. No such file or directory Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf Finished reading configuration. ******** PGXC_CTL START *************** Current directory: /postgresxl/bin [postgres@lhrpgxl90 ~]$ ll /postgresxl/bin/pgxc_ctl.conf -rw-rw-r-- 1 postgres postgres 17815 Feb 21 17:18 /postgresxl/bin/pgxc_ctl.conf |
配置pgxc_ctl.conf
在lhrpgxl90上运行即可。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 | cat > /postgresxl/bin/pgxc_ctl.conf <<"EOF" pgxcInstallDir=$PGHOME pgxlDATA=$PGHOME/data pgxcOwner=postgres pgxcUser=postgres tmpDir=/tmp localTmpDir=$tmpDir #========================================================================================================================== #---- GTM Master --------------- gtmName=gtm gtmMasterServer=lhrpgxl90 gtmMasterPort=6666 gtmMasterDir=$pgxlDATA/gtm #---- Configuration gtmExtraConfig=none gtmMasterSpecificExtraConfig=none #---- GTM Slave配置信息 gtmSlave=y # Specify y if you configure GTM Slave. Otherwise, GTM slave will not be configured and # all the following variables will be reset. gtmSlaveName=gtmSlave gtmSlaveServer=lhrpgxl90 # value none means GTM slave is not available. Give none if you don't configure GTM Slave. gtmSlavePort=20001 # Not used if you don't configure GTM slave. gtmSlaveDir=$pgxlDATA/gtmSlave # Not used if you don't configure GTM slave. #---- Configuration gtmSlaveSpecificExtraConfig=none #========================================================================================================================== #---- GTM Proxy配置信息,最好每个数据节点配置一个 #---- GTM-Proxy Master ------- gtmProxyDir=$pgxlDATA/gtm_proxy gtmProxy=y gtmProxyNames=(gtm_pxy1 gtm_pxy2) gtmProxyServers=(lhrpgxl91 lhrpgxl92) gtmProxyPorts=(6666 6666) gtmProxyDirs=($gtmProxyDir $gtmProxyDir) #---- Configuration gtmPxyExtraConfig=none gtmPxySpecificExtraConfig=(none none) #========================================================================================================================== #---- Coordinators --------- coordMasterDir=$pgxlDATA/coord coordNames=(coord1 coord2) coordPorts=(5432 5432) poolerPorts=(6667 6667) coordPgHbaEntries=(0.0.0.0/0) coordMasterServers=(lhrpgxl91 lhrpgxl92) coordMasterDirs=($coordMasterDir $coordMasterDir) coordMaxWALsernder=0 coordMaxWALSenders=($coordMaxWALsernder $coordMaxWALsernder) coordSlave=n #========================================================================================================================== #---- Datanodes ---------- datanodeMasterDir=$pgxlDATA/dn_master primaryDatanode=lhrpgxl91 datanodeNames=(datanode1 datanode2) datanodePorts=(5433 5433) datanodePoolerPorts=(6668 6668) datanodePgHbaEntries=(0.0.0.0/0) datanodeMasterServers=(lhrpgxl91 lhrpgxl92) datanodeMasterDirs=($datanodeMasterDir $datanodeMasterDir) datanodeMaxWalSender=4 datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender) datanodeSlave=n #========================================================================================================================== EOF |
初始化集群
1 | pgxc_ctl -c /postgresxl/bin/pgxc_ctl.conf init all |
过程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 | [postgres@lhrpgxl90 ~]$ pgxc_ctl -c /postgresxl/bin/pgxc_ctl.conf init all /usr/bin/bash Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf Finished reading configuration. ******** PGXC_CTL START *************** Current directory: /postgresxl/bin Initialize GTM master The files belonging to this GTM system will be owned by user "postgres". This user must also own the server process. fixing permissions on existing directory /postgresxl/data/gtm ... ok creating configuration files ... ok creating control file ... ok Success. waiting for server to shut down.... done server stopped Done. Start GTM master server starting Initialize GTM slave The files belonging to this GTM system will be owned by user "postgres". This user must also own the server process. fixing permissions on existing directory /postgresxl/data/gtmSlave ... ok creating configuration files ... ok creating control file ... ok Success. Done. Start GTM slaveserver starting Done. Initialize all the gtm proxies. Initializing gtm proxy gtm_pxy1. Initializing gtm proxy gtm_pxy2. The files belonging to this GTM system will be owned by user "postgres". This user must also own the server process. fixing permissions on existing directory /postgresxl/data/gtm_proxy ... ok creating configuration files ... ok Success. The files belonging to this GTM system will be owned by user "postgres". This user must also own the server process. fixing permissions on existing directory /postgresxl/data/gtm_proxy ... ok creating configuration files ... ok Success. Done. Starting all the gtm proxies. Starting gtm proxy gtm_pxy1. Starting gtm proxy gtm_pxy2. server starting server starting Done. Initialize all the coordinator masters. Initialize coordinator master coord1. Initialize coordinator master coord2. The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /postgresxl/data/coord ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... creating cluster information ... ok syncing data to disk ... ok freezing database template0 ... ok freezing database template1 ... ok freezing database postgres ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /postgresxl/data/coord ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... creating cluster information ... ok syncing data to disk ... ok freezing database template0 ... ok freezing database template1 ... ok freezing database postgres ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. Done. Starting coordinator master. Starting coordinator master coord1 Starting coordinator master coord2 2022-02-22 15:22:42.142 CST [23303] LOG: listening on IPv4 address "0.0.0.0", port 5432 2022-02-22 15:22:42.142 CST [23303] LOG: listening on IPv6 address "::", port 5432 2022-02-22 15:22:42.203 CST [23303] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-02-22 15:22:42.291 CST [23304] LOG: database system was shut down at 2022-02-22 15:22:38 CST 2022-02-22 15:22:42.322 CST [23303] LOG: database system is ready to accept connections 2022-02-22 15:22:42.323 CST [23311] LOG: cluster monitor started 2022-02-22 15:22:42.142 CST [23180] LOG: listening on IPv4 address "0.0.0.0", port 5432 2022-02-22 15:22:42.142 CST [23180] LOG: listening on IPv6 address "::", port 5432 2022-02-22 15:22:42.203 CST [23180] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-02-22 15:22:42.291 CST [23181] LOG: database system was shut down at 2022-02-22 15:22:38 CST 2022-02-22 15:22:42.322 CST [23180] LOG: database system is ready to accept connections 2022-02-22 15:22:42.323 CST [23188] LOG: cluster monitor started Done. Initialize all the datanode masters. Initialize the datanode master datanode1. Initialize the datanode master datanode2. The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /postgresxl/data/dn_master ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... creating cluster information ... ok syncing data to disk ... ok freezing database template0 ... ok freezing database template1 ... ok freezing database postgres ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /postgresxl/data/dn_master ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... creating cluster information ... ok syncing data to disk ... ok freezing database template0 ... ok freezing database template1 ... ok freezing database postgres ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. Done. Starting all the datanode masters. Starting datanode master datanode1. Starting datanode master datanode2. 2022-02-22 15:22:50.478 CST [23764] LOG: listening on IPv4 address "0.0.0.0", port 5433 2022-02-22 15:22:50.478 CST [23764] LOG: listening on IPv6 address "::", port 5433 2022-02-22 15:22:50.555 CST [23764] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433" 2022-02-22 15:22:50.665 CST [23764] LOG: redirecting log output to logging collector process 2022-02-22 15:22:50.665 CST [23764] HINT: Future log output will appear in directory "pg_log". 2022-02-22 15:22:50.478 CST [23641] LOG: listening on IPv4 address "0.0.0.0", port 5433 2022-02-22 15:22:50.478 CST [23641] LOG: listening on IPv6 address "::", port 5433 2022-02-22 15:22:50.522 CST [23641] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433" 2022-02-22 15:22:50.625 CST [23641] LOG: redirecting log output to logging collector process 2022-02-22 15:22:50.625 CST [23641] HINT: Future log output will appear in directory "pg_log". Done. ALTER NODE coord1 WITH (HOST='lhrpgxl91', PORT=5432); ALTER NODE CREATE NODE coord2 WITH (TYPE='coordinator', HOST='lhrpgxl92', PORT=5432); CREATE NODE CREATE NODE datanode1 WITH (TYPE='datanode', HOST='lhrpgxl91', PORT=5433, PREFERRED); CREATE NODE CREATE NODE datanode2 WITH (TYPE='datanode', HOST='lhrpgxl92', PORT=5433); CREATE NODE SELECT pgxc_pool_reload(); pgxc_pool_reload ------------------ t (1 row) CREATE NODE coord1 WITH (TYPE='coordinator', HOST='lhrpgxl91', PORT=5432); CREATE NODE ALTER NODE coord2 WITH (HOST='lhrpgxl92', PORT=5432); ALTER NODE CREATE NODE datanode1 WITH (TYPE='datanode', HOST='lhrpgxl91', PORT=5433); CREATE NODE CREATE NODE datanode2 WITH (TYPE='datanode', HOST='lhrpgxl92', PORT=5433, PREFERRED); CREATE NODE SELECT pgxc_pool_reload(); pgxc_pool_reload ------------------ t (1 row) Done. EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''lhrpgxl91'', PORT=5432)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode1) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''lhrpgxl92'', PORT=5432)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode1) 'ALTER NODE datanode1 WITH (TYPE=''datanode'', HOST=''lhrpgxl91'', PORT=5433, PREFERRED)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode1) 'CREATE NODE datanode2 WITH (TYPE=''datanode'', HOST=''lhrpgxl92'', PORT=5433, PREFERRED)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode1) 'SELECT pgxc_pool_reload()'; pgxc_pool_reload ------------------ t (1 row) EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord1 WITH (TYPE=''coordinator'', HOST=''lhrpgxl91'', PORT=5432)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode2) 'CREATE NODE coord2 WITH (TYPE=''coordinator'', HOST=''lhrpgxl92'', PORT=5432)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode2) 'CREATE NODE datanode1 WITH (TYPE=''datanode'', HOST=''lhrpgxl91'', PORT=5433, PREFERRED)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode2) 'ALTER NODE datanode2 WITH (TYPE=''datanode'', HOST=''lhrpgxl92'', PORT=5433, PREFERRED)'; EXECUTE DIRECT EXECUTE DIRECT ON (datanode2) 'SELECT pgxc_pool_reload()'; pgxc_pool_reload ------------------ t (1 row) Done. [postgres@lhrpgxl90 ~]$ pgxc_ctl /usr/bin/bash Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf Finished reading configuration. ******** PGXC_CTL START *************** Current directory: /postgresxl/bin PGXC show config all ========= Postgres-XL configuration Common Info ======================== === Overall === Postgres-XL owner: postgres Postgres-XL user: postgres Postgres-XL install directory: /postgresxl pgxc_ctl home: /postgresxl/bin pgxc_ctl configuration file: /postgresxl/bin/pgxc_ctl.conf pgxc_ctl tmpDir: /tmp pgxc_ctl localTempDir: /tmp pgxc_ctl log file: /home/postgres/pgxc_ctl/pgxc_log/24719_pgxc_ctl.log pgxc_ctl configBackup: n pgxc_ctl configBackupHost: none pgxc_ctl configBackupFile: none ========= Postgres-XL configuration End Common Info =================== ====== Server: lhrpgxl90 ======= GTM Master: Nodename: 'gtm', port: 6666, dir: '/postgresxl/data/gtm' ExtraConfig: 'none', Specific Extra Config: 'none' GTM Slave: Nodename: 'gtmSlave', port: 20001, dir: '/postgresxl/data/gtmSlave' ExtraConfig: 'none', Specific Extra Config: 'none' ====== Server: lhrpgxl91 ======= GTM Proxy: Nodename: 'gtm_pxy1', port: 6666, dir: '/postgresxl/data/gtm_proxy' ExtraConfig: 'none', Specific Extra Config: 'none' Coordinator Master: Nodename: 'coord1', port: 5432, pooler port: 6667 MaxWalSenders: 0, Dir: '/postgresxl/data/coord' ExtraConfig: '(null)', Specific Extra Config: '(null)' pg_hba entries ( '0.0.0.0/0' ) Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)' Datanode Master: Nodename: 'datanode1', port: 5433, pooler port 6667 MaxWALSenders: 4, Dir: '/postgresxl/data/dn_master' ExtraConfig: '(null)', Specific Extra Config: '(null)' pg_hba entries ( '0.0.0.0/0' ) Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)' ====== Server: lhrpgxl92 ======= GTM Proxy: Nodename: 'gtm_pxy2', port: 6666, dir: '/postgresxl/data/gtm_proxy' ExtraConfig: 'none', Specific Extra Config: 'none' Coordinator Master: Nodename: 'coord2', port: 5432, pooler port: 6667 MaxWalSenders: 0, Dir: '/postgresxl/data/coord' ExtraConfig: '(null)', Specific Extra Config: '(null)' pg_hba entries ( '0.0.0.0/0' ) Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)' Datanode Master: Nodename: 'datanode2', port: 5433, pooler port 6667 MaxWALSenders: 4, Dir: '/postgresxl/data/dn_master' ExtraConfig: '(null)', Specific Extra Config: '(null)' pg_hba entries ( '0.0.0.0/0' ) Extra pg_hba: '(null)', Specific Extra pg_hba: '(null)' PGXC monitor all Running: gtm master Running: gtm slave Running: gtm proxy gtm_pxy1 Running: gtm proxy gtm_pxy2 Running: coordinator master coord1 Running: coordinator master coord2 Running: datanode master datanode1 Running: datanode master datanode2 |
查看集群信息
在lhrpgxl91节点,执行psql -p 5432进入数据库操作。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 | [root@lhrpgxl91 /]# su - postgres Last login: Mon Feb 21 17:11:45 CST 2022 on pts/0 [postgres@lhrpgxl91 ~]$ psql -p 5432 psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2)) Type "help" for help. postgres=# postgres=# select * from pgxc_node; node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id -----------+-----------+-----------+-----------+----------------+------------------+------------- coord1 | C | 5432 | lhrpgxl91 | f | f | 1885696643 coord2 | C | 5432 | lhrpgxl92 | f | f | -1197102633 datanode1 | D | 5433 | lhrpgxl91 | f | t | 888802358 datanode2 | D | 5433 | lhrpgxl92 | f | f | -905831925 (4 rows) -- node_type中的C代表coordinator,D代表DataNode postgres=# create database lhrdb; CREATE DATABASE postgres=# \l List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -----------+----------+----------+-------------+-------------+----------------------- lhrdb | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres (4 rows) postgres=# \c lhrdb psql (14.0, server 10beta3 (Postgres-XL 10alpha2)) You are now connected to database "lhrdb" as user "postgres". lhrdb=# create table test1(id int,name text); CREATE TABLE lhrdb=# insert into test1(id,name) select generate_series(1,8),'test'; INSERT 0 8 lhrdb=# lhrdb=# select count(*) from test1; count ------- 8 (1 row) lhrdb=# SELECT xc_node_id, count(*) FROM test1 GROUP BY xc_node_id; xc_node_id | count ------------+------- -905831925 | 3 888802358 | 5 (2 rows) [postgres@lhrpgxl91 ~]$ psql -p 5433 -d lhrdb psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2)) Type "help" for help. lhrdb=# select count(*) from test1; count ------- 5 (1 row) lhrdb=# [postgres@lhrpgxl91 ~]$ psql -p 5433 -d lhrdb -h lhrpgxl92 psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2)) Type "help" for help. lhrdb=# select count(*) from test1; count ------- 3 (1 row) |
注意:由于所有的数据节点组成了完整的数据视图,所以一个数据节点down机,整个pgxl都启动不了了,所以实际生产中,为了提高可用性,一定要配置数据节点的热备以便进行故障转移准备。
进程和端口信息
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | [postgres@lhrpgxl90 ~]$ netstat -tulnp | grep gtm (Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.) tcp 0 0 0.0.0.0:20001 0.0.0.0:* LISTEN 3002/gtm tcp 0 0 0.0.0.0:6666 0.0.0.0:* LISTEN 2885/gtm tcp6 0 0 :::20001 :::* LISTEN 3002/gtm tcp6 0 0 :::6666 :::* LISTEN 2885/gtm [postgres@lhrpgxl90 ~]$ ps -ef|grep gtm postgres 2885 1 0 10:00 ? 00:00:00 gtm -D /postgresxl/data/gtm postgres 3002 1 0 10:00 ? 00:00:00 gtm -D /postgresxl/data/gtmSlave postgres 3291 485 0 10:02 pts/0 00:00:00 grep --color=auto gtm [postgres@lhrpgxl91 ~]$ netstat -tulnp | grep "gtm\|postgres" (Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.) tcp 0 0 0.0.0.0:6666 0.0.0.0:* LISTEN 2827/gtm_proxy tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 2929/postgres tcp 0 0 0.0.0.0:5433 0.0.0.0:* LISTEN 3038/postgres tcp6 0 0 :::6666 :::* LISTEN 2827/gtm_proxy tcp6 0 0 :::5432 :::* LISTEN 2929/postgres tcp6 0 0 :::5433 :::* LISTEN 3038/postgres [postgres@lhrpgxl91 ~]$ ps -ef|grep postgres root 1749 295 0 09:46 pts/0 00:00:00 su - postgres postgres 1750 1749 0 09:46 pts/0 00:00:00 -bash postgres 2827 1 0 10:00 ? 00:00:00 gtm_proxy -D /postgresxl/data/gtm_proxy postgres 2929 1 0 10:00 ? 00:00:00 /postgresxl/bin/postgres --coordinator -D /postgresxl/data/coord -i postgres 2931 2929 0 10:00 ? 00:00:00 postgres: pooler process postgres 2932 2929 0 10:00 ? 00:00:00 postgres: checkpointer process postgres 2933 2929 0 10:00 ? 00:00:00 postgres: writer process postgres 2934 2929 0 10:00 ? 00:00:00 postgres: wal writer process postgres 2935 2929 0 10:00 ? 00:00:00 postgres: autovacuum launcher process postgres 2936 2929 0 10:00 ? 00:00:00 postgres: stats collector process postgres 2937 2929 0 10:00 ? 00:00:00 postgres: cluster monitor process postgres 2938 2929 0 10:00 ? 00:00:00 postgres: bgworker: logical replication launcher postgres 3038 1 0 10:00 ? 00:00:00 /postgresxl/bin/postgres --datanode -D /postgresxl/data/dn_master -i postgres 3039 3038 0 10:00 ? 00:00:00 postgres: logger process postgres 3042 3038 0 10:00 ? 00:00:00 postgres: pooler process postgres 3043 3038 0 10:00 ? 00:00:00 postgres: checkpointer process postgres 3044 3038 0 10:00 ? 00:00:00 postgres: writer process postgres 3045 3038 0 10:00 ? 00:00:00 postgres: wal writer process postgres 3046 3038 0 10:00 ? 00:00:00 postgres: autovacuum launcher process postgres 3047 3038 0 10:00 ? 00:00:00 postgres: stats collector process postgres 3048 3038 0 10:00 ? 00:00:00 postgres: cluster monitor process postgres 3049 3038 0 10:00 ? 00:00:00 postgres: bgworker: logical replication launcher |
建表说明
- REPLICATION表:各个datanode节点中,表的数据完全相同,也就是说,插入数据时,会分别在每个datanode节点插入相同数据。读数据时,只需要读任意一个datanode节点上的数据。
1 | lhrdb=# CREATE TABLE repltab (col1 int, col2 int) DISTRIBUTE BY REPLICATION; |
- DISTRIBUTE :会将插入的数据,按照拆分规则,分配到不同的datanode节点中存储,也就是sharding技术。每个datanode节点只保存了部分数据,通过coordinate节点可以查询完整的数据视图。
1 | lhrdb=# CREATE TABLE disttab(col1 int, col2 int, col3 text) DISTRIBUTE BY HASH(col1); |
模拟部分数据,插入测试数据:
1 2 3 4 5 6 | #任意登录一个coordinate节点进行建表操作 [postgres@lhrpgxl91 ~]$ psql -p 5432 lhrdb=# INSERT INTO disttab SELECT generate_series(1,100), generate_series(101, 200), 'foo'; INSERT 0 100 lhrdb=# INSERT INTO repltab SELECT generate_series(1,100), generate_series(101, 200); INSERT 0 100 |
查看数据分布结果:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | #DISTRIBUTE表分布结果 lhrdb=# SELECT xc_node_id, count(*) FROM disttab GROUP BY xc_node_id; xc_node_id | count ------------+------- -905831925 | 58 888802358 | 42 (2 rows) lhrdb=# select oid,* from pgxc_node; oid | node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred | node_id -------+-----------+-----------+-----------+-----------+----------------+------------------+------------- 11739 | coord1 | C | 5432 | lhrpgxl91 | f | f | 1885696643 16384 | coord2 | C | 5432 | lhrpgxl92 | f | f | -1197102633 16385 | datanode1 | D | 5433 | lhrpgxl91 | f | t | 888802358 16386 | datanode2 | D | 5433 | lhrpgxl92 | f | f | -905831925 (4 rows) #REPLICATION表分布结果 lhrdb=# SELECT xc_node_id, count(*) FROM repltab GROUP BY xc_node_id; xc_node_id | count -------------+------- -1151313560 | 100 (1 row) |
查看另一个datanode2中repltab表结果:
1 2 3 4 5 6 7 8 9 | [postgres@lhrpgxl92 ~]$ psql -p 5433 -d lhrdb psql (PGXL 10alpha2, based on PG 10beta3 (Postgres-XL 10alpha2)) Type "help" for help. lhrdb=# SELECT count(*) FROM repltab; count ------- 100 (1 row) |
结论:REPLICATION表中,datanode1,datanode2中表是全部数据,一模一样。而DISTRIBUTE表,数据散落近乎平均分配到了datanode1,datanode2节点中。
启动和关闭集群
以后启动,直接执行如下命令:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 | [postgres@lhrpgxl90 ~]$ pgxc_ctl start all /usr/bin/bash Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf Finished reading configuration. ******** PGXC_CTL START *************** Current directory: /postgresxl/bin Start GTM master server starting Start GTM slaveserver starting Done. Starting all the gtm proxies. Starting gtm proxy gtm_pxy1. Starting gtm proxy gtm_pxy2. server starting server starting Done. Starting coordinator master. Starting coordinator master coord1 Starting coordinator master coord2 2022-02-22 15:31:29.336 CST [24824] LOG: listening on IPv4 address "0.0.0.0", port 5432 2022-02-22 15:31:29.336 CST [24824] LOG: listening on IPv6 address "::", port 5432 2022-02-22 15:31:29.401 CST [24824] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-02-22 15:31:29.571 CST [24825] LOG: database system was shut down at 2022-02-22 15:31:00 CST 2022-02-22 15:31:29.617 CST [24824] LOG: database system is ready to accept connections 2022-02-22 15:31:29.618 CST [24832] LOG: cluster monitor started 2022-02-22 15:31:29.336 CST [24697] LOG: listening on IPv4 address "0.0.0.0", port 5432 2022-02-22 15:31:29.336 CST [24697] LOG: listening on IPv6 address "::", port 5432 2022-02-22 15:31:29.401 CST [24697] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-02-22 15:31:29.571 CST [24698] LOG: database system was shut down at 2022-02-22 15:31:00 CST 2022-02-22 15:31:29.617 CST [24697] LOG: database system is ready to accept connections 2022-02-22 15:31:29.618 CST [24706] LOG: cluster monitor started Done. Starting all the datanode masters. Starting datanode master datanode1. Starting datanode master datanode2. 2022-02-22 15:31:31.157 CST [24933] LOG: listening on IPv4 address "0.0.0.0", port 5433 2022-02-22 15:31:31.157 CST [24933] LOG: listening on IPv6 address "::", port 5433 2022-02-22 15:31:31.220 CST [24933] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433" 2022-02-22 15:31:31.322 CST [24933] LOG: redirecting log output to logging collector process 2022-02-22 15:31:31.322 CST [24933] HINT: Future log output will appear in directory "pg_log". 2022-02-22 15:31:31.157 CST [24807] LOG: listening on IPv4 address "0.0.0.0", port 5433 2022-02-22 15:31:31.157 CST [24807] LOG: listening on IPv6 address "::", port 5433 2022-02-22 15:31:31.220 CST [24807] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433" 2022-02-22 15:31:31.322 CST [24807] LOG: redirecting log output to logging collector process 2022-02-22 15:31:31.322 CST [24807] HINT: Future log output will appear in directory "pg_log". Done. |
停止集群如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | [postgres@lhrpgxl90 ~]$ pgxc_ctl stop all /usr/bin/bash Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Installing pgxc_ctl_bash script as /postgresxl/bin/pgxc_ctl_bash. Reading configuration using /postgresxl/bin/pgxc_ctl_bash --home /postgresxl/bin --configuration /postgresxl/bin/pgxc_ctl.conf Finished reading configuration. ******** PGXC_CTL START *************** Current directory: /postgresxl/bin Stopping all the coordinator masters. Stopping coordinator master coord1. Stopping coordinator master coord2. Done. Stopping all the datanode masters. Stopping datanode master datanode1. Stopping datanode master datanode2. Done. Stopping all the gtm proxies. Stopping gtm proxy gtm_pxy1. Stopping gtm proxy gtm_pxy2. waiting for server to shut down.... done server stopped waiting for server to shut down.... done server stopped Done. Stop GTM slave waiting for server to shut down.... done server stopped Stop GTM master waiting for server to shut down.... done server stopped |
这几个主要命令暂时这么多,更多请从pgxc_ctl --help中获取更多信息。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | [postgres@lhrpgxl90 ~]$ pgxc_ctl --help /usr/bin/bash pgxc_ctl [option ...] [command] option: -c or --configuration conf_file: Specify configruration file. -v or --verbose: Specify verbose output. -V or --version: Print version and exit. -l or --logdir log_directory: specifies what directory to write logs. -L or --logfile log_file: Specifies log file. --home home_direcotry: Specifies pgxc_ctl work director. -i or --infile input_file: Specifies inptut file. -o or --outfile output_file: Specifies output file. -h or --help: Prints this message and exits. For more deatils, refer to pgxc_ctl reference manual included in postgres-xc reference manual. |