Quickly Building a GBase 8c V5 Cluster in Docker by Simulating Multiple Hosts
Tags: GBase, GBase 8c, distributed database, installation and deployment, high availability
Environment Preparation

Provisioning the environment

Host machine: 32 GB RAM and 8 GB swap. Each node needs at least 4 GB RAM plus 8 GB swap, otherwise the installation will fail.
| IP | hostname | Role |
|---|---|---|
| 172.72.3.30 | gbase8c_1 | gha_server (high-availability service), dcs (distributed configuration store), gtm (global transaction manager), coordinator |
| 172.72.3.31 | gbase8c_2 | datanode 1 |
| 172.72.3.32 | gbase8c_3 | datanode 2 |
| Term | Role | Function | Deployment |
|---|---|---|---|
| GHA Server | High-availability manager | Manages the HA state of every node in the cluster, similar to Patroni | Primary/standby HA architecture; primary and standby can replicate synchronously or asynchronously |
| DCS / HA Center | Cluster state store | Stores each node's HA state and arbitrates the status of cluster nodes during failures | Uses the Raft replication protocol |
| GTM | Global Transaction Manager | Generates and maintains global timestamps to guarantee data consistency across the cluster | Primary/standby HA architecture; synchronous or asynchronous replication |
| CN / Coordinator | Coordinator | External entry point: parses and optimizes SQL, generates execution plans, and coordinates data nodes for queries and writes | Fully peer-to-peer deployment |
| DN / Datanode | Data node | Stores the node's metadata and the shard of business data it owns | Primary/standby HA architecture; synchronous or asynchronous replication |
```shell
# Create the Docker network
docker network create --subnet=172.72.0.0/16 lhrnw

docker rm -f gbase8c_1
docker run -itd --name gbase8c_1 -h gbase8c_1 \
  --net=lhrnw --ip 172.72.3.30 \
  -p 63330:5432 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true \
  --add-host='gbase8c_1:172.72.3.30' \
  --add-host='gbase8c_2:172.72.3.31' \
  --add-host='gbase8c_3:172.72.3.32' \
  lhrbest/lhrcentos76:9.0 \
  /usr/sbin/init

docker rm -f gbase8c_2
docker run -itd --name gbase8c_2 -h gbase8c_2 \
  --net=lhrnw --ip 172.72.3.31 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true \
  --add-host='gbase8c_1:172.72.3.30' \
  --add-host='gbase8c_2:172.72.3.31' \
  --add-host='gbase8c_3:172.72.3.32' \
  lhrbest/lhrcentos76:9.0 \
  /usr/sbin/init

docker rm -f gbase8c_3
docker run -itd --name gbase8c_3 -h gbase8c_3 \
  --net=lhrnw --ip 172.72.3.32 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true \
  --add-host='gbase8c_1:172.72.3.30' \
  --add-host='gbase8c_2:172.72.3.31' \
  --add-host='gbase8c_3:172.72.3.32' \
  lhrbest/lhrcentos76:9.0 \
  /usr/sbin/init

docker cp GBase8cV5_S3.0.0B76_centos7.8_x86_64.tar.gz gbase8c_1:/soft/
docker cp sshUserSetup.sh gbase8c_1:/soft/
```
Environment Configuration

Add swap space and tune swap settings

If a machine has 4 GB of RAM or less, add swap space and raise the swappiness parameter. Otherwise, once memory is exhausted the system becomes extremely sluggish and the cluster ends up in a bad state:

```shell
dd if=/dev/zero of=/root/swapfile bs=10M count=3200
chmod 0600 /root/swapfile
mkswap /root/swapfile
swapon /root/swapfile
echo '/root/swapfile swap swap defaults 0 0' >> /etc/fstab
swapon -s

echo 20 > /proc/sys/vm/swappiness
echo 'vm.swappiness=20' >> /etc/sysctl.conf
cat /proc/sys/vm/swappiness
cat /etc/sysctl.conf | grep swappiness
```
Each machine needs at least 4 GB of RAM and 8 GB of swap, otherwise errors such as these appear: "Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (1024 Mbytes) or shared memory (4496 Mbytes) is larger.", "could not create shared memory segment: Cannot allocate memory", "This error usually means that openGauss's request for a shared memory segment exceeded available memory or swap space, or exceeded your kernel's SHMALL parameter. You can either reduce the request size or reconfigure the kernel with larger SHMALL. To reduce the request size (currently 1841225728 bytes), reduce openGauss's shared memory usage, perhaps by reducing shared_buffers."

One way to reduce the request is to shrink shared_buffers, for example:

```
shared_buffers = 126MB
```

You can also change the parameters with sed:

```shell
sudo sed -i '/shared_buffers = 1GB/c shared_buffers = 256MB' /home/gbase/data/gtm/gtm1/postgresql.conf
sudo sed -i '/shared_buffers = 1GB/c shared_buffers = 256MB' /home/gbase/data/coord/cn1/postgresql.conf
sudo sed -i '/shared_buffers = 1GB/c shared_buffers = 256MB' /home/gbase/data/dn1/dn1_1/postgresql.conf
sudo sed -i '/shared_buffers = 1GB/c shared_buffers = 256MB' /home/gbase/data/dn2/dn2_1/postgresql.conf

sudo sed -i '/cstore_buffers = 1GB/c cstore_buffers = 32MB' /home/gbase/data/gtm/gtm1/postgresql.conf
sudo sed -i '/cstore_buffers = 1GB/c cstore_buffers = 32MB' /home/gbase/data/coord/cn1/postgresql.conf
sudo sed -i '/cstore_buffers = 1GB/c cstore_buffers = 32MB' /home/gbase/data/dn1/dn1_1/postgresql.conf
sudo sed -i '/cstore_buffers = 1GB/c cstore_buffers = 32MB' /home/gbase/data/dn2/dn2_1/postgresql.conf
```
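The memory requirement above can also be checked in a script before installing. A minimal sketch that reads /proc/meminfo; the helper name and the exact 4 GB / 8 GB thresholds (taken from the requirement stated earlier) are my own choices:

```shell
# meets_minimum MEM_KB SWAP_KB -> succeeds when at least 4 GB RAM and 8 GB swap
meets_minimum() {
    [ "$1" -ge $((4 * 1024 * 1024)) ] && [ "$2" -ge $((8 * 1024 * 1024)) ]
}

# Read the live values from /proc/meminfo (Linux only)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
if meets_minimum "$mem_kb" "$swap_kb"; then
    echo "OK: ${mem_kb} kB RAM, ${swap_kb} kB swap"
else
    echo "WARNING: below the 4 GB RAM / 8 GB swap minimum"
fi
```

Running this on every node before installing avoids a half-finished install that then has to be uninstalled.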
Install dependency packages on all nodes

```shell
yum install -y libaio-devel flex bison ncurses-devel \
  glibc-devel patch readline-devel bzip2 firewalld \
  crontabs net-tools openssh-server openssh-clients which sshpass \
  ntp chrony

systemctl disable firewalld
systemctl stop firewalld
systemctl status firewalld

# Disable SELinux (reboot the OS for the config change to take full effect)
sed -i s/SELINUX=enforcing/SELINUX=disabled/g /etc/selinux/config
setenforce 0
getenforce

# Install ntp or chronyd to keep time in sync across the cluster nodes
systemctl unmask ntpd
systemctl enable ntpd
systemctl start ntpd
systemctl status ntpd

systemctl disable chronyd
systemctl stop chronyd
systemctl status chronyd
```
Set hostnames

Adjust the IP addresses of the three nodes as needed; I use the following three IPs and set the hostname on each node:

```shell
# node 1:
hostnamectl set-hostname gbase8c_1.lhr.com
# node 2:
hostnamectl set-hostname gbase8c_2.lhr.com
# node 3:
hostnamectl set-hostname gbase8c_3.lhr.com
```

In Docker this step can be skipped (the `-h` option of `docker run` above already set the hostnames).
Tune kernel parameters

```shell
cat >> /etc/sysctl.conf <<"EOF"
kernel.shmmax = 4398046511104
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 32000 1024000000 500 32000
EOF

sysctl -p
```
If kernel.shmmax is set too low, shared-memory errors like the following appear:

```
FATAL: could not create shared memory segment: Invalid argument
DETAIL: Failed system call was shmget(key=6666001, size=4714683328, 03600).
HINT: This error usually means that openGauss's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 4714683328 bytes), reduce openGauss's shared memory usage, perhaps by reducing shared_buffers.
If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.
The openGauss documentation contains more information about shared memory configuration.
```
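The failing request can be compared against the kernel limit directly; a small sketch (the helper name is mine, and the request size is the one reported by the shmget() error above):

```shell
# shm_fits REQUESTED_BYTES SHMMAX_BYTES -> succeeds when the request fits
shm_fits() { [ "$1" -le "$2" ]; }

req=4714683328                                  # size from the shmget() call in the error above
max=$(cat /proc/sys/kernel/shmmax 2>/dev/null || echo 0)
if shm_fits "$req" "$max"; then
    echo "kernel.shmmax is large enough"
else
    echo "kernel.shmmax too small - apply the sysctl settings above"
fi
```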
Create the user on all nodes

```shell
useradd gbase
echo "gbase:lhr" | chpasswd
echo "gbase ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
```
Configure SSH mutual trust

As root, on the primary node only:

```shell
./sshUserSetup.sh -user gbase -hosts "gbase8c_1 gbase8c_2 gbase8c_3" -advanced -exverify -confirm

# On all nodes:
chmod 600 /home/gbase/.ssh/config
```
Extract the installation package

On the primary node only, extract the package GBase8cV5_S3.0.0B76_centos7.8_x86_64.tar.gz:

```shell
su - gbase
mkdir -p /home/gbase/gbase_package
cp /soft/GBase8cV5_S3.0.0B76_centos7.8_x86_64.tar.gz /home/gbase/gbase_package
cd /home/gbase/gbase_package
tar -zxvf GBase8cV5_S3.0.0B76_centos7.8_x86_64.tar.gz
tar zxf GBase8cV5_S3.0.0B76_CentOS_x86_64_om.tar.gz
```
Example:

```shell
[gbase@gbase8c_1 gbase_package]$ ll -h
total 514M
drwxrwxr-x  5 gbase gbase   165 Feb 27 16:48 dependency
-rw-r--r--  1 root  root   257M Mar 17 16:36 GBase8cV5_S3.0.0B76_centos7.8_x86_64.tar.gz
-rw-rw-r--  1 gbase gbase    65 Feb 27 16:48 GBase8cV5_S3.0.0B76_CentOS_x86_64_om.sha256
-rw-rw-r--  1 gbase gbase   99M Feb 27 16:48 GBase8cV5_S3.0.0B76_CentOS_x86_64_om.tar.gz
-rw-rw-r--  1 gbase gbase 1012K Feb 27 16:48 GBase8cV5_S3.0.0B76_CentOS_x86_64_pgpool.tar.gz
-rw-rw-r--  1 gbase gbase    65 Feb 27 16:48 GBase8cV5_S3.0.0B76_CentOS_x86_64.sha256
-rw-rw-r--  1 gbase gbase  158M Feb 27 16:48 GBase8cV5_S3.0.0B76_CentOS_x86_64.tar.bz2
-rw-rw-r--  1 gbase gbase  2.6K Feb 27 16:48 gbase.yml
drwxrwxr-x 11 gbase gbase  4.0K Feb 27 16:48 gha
-rw-rw-r--  1 gbase gbase   188 Feb 27 16:48 gha_ctl.ini
drwxrwxr-x  2 gbase gbase    96 Feb 27 16:48 lib
-rw-rw-r--  1 gbase gbase   729 Feb 27 16:48 package_info.json
drwxr-xr-x  4 gbase gbase    28 Mar 16  2021 python3.8
drwxrwxr-x 10 gbase gbase  4.0K Feb 27 16:48 script
drwxrwxr-x  2 gbase gbase   330 Feb 27 16:48 simpleInstall
-rw-rw-r--  1 gbase gbase   118 Feb 27 16:48 ubuntu_version.json
drwx------  6 gbase gbase    87 Jul  2  2022 venv
-rw-rw-r--  1 gbase gbase    36 Feb 27 16:48 version.cfg
[gbase@gbase8c_1 gbase_package]$
```
Start the installation

Operate on the primary node only.

Edit the cluster deployment file gbase.yml:

```shell
cd /home/gbase/gbase_package
mv gbase.yml gbase.yml.bak
```

- Modify gbase.yml as follows:

```shell
cat > /home/gbase/gbase_package/gbase.yml <<"EOF"
gha_server:
  - gha_server1:
      host: 172.72.3.30
      port: 20001
dcs:
  - host: 172.72.3.30
    port: 2379
gtm:
  - gtm1:
      host: 172.72.3.30
      agent_host: 172.72.3.30
      role: primary
      port: 6666
      agent_port: 8001
      work_dir: /home/gbase/data/gtm/gtm1
coordinator:
  - cn1:
      host: 172.72.3.30
      agent_host: 172.72.3.30
      role: primary
      port: 5432
      agent_port: 8003
      work_dir: /home/gbase/data/coord/cn1
datanode:
  - dn1:
      - dn1_1:
          host: 172.72.3.31
          agent_host: 172.72.3.31
          role: primary
          port: 15432
          agent_port: 8005
          work_dir: /home/gbase/data/dn1/dn1_1
  - dn2:
      - dn2_1:
          host: 172.72.3.32
          agent_host: 172.72.3.32
          role: primary
          port: 20010
          agent_port: 8007
          work_dir: /home/gbase/data/dn2/dn2_1
env:
  # cluster_type allowed values: multiple-nodes, single-inst, default is multiple-nodes
  cluster_type: multiple-nodes
  pkg_path: /home/gbase/gbase_package   # path to the installation package
  prefix: /home/gbase/gbase_db          # runtime directory
  version: V5_S3.0.0B76                 # must match the package version GBase8cV5_S3.0.0B76
  user: gbase
  port: 22
# constant:
#   virtual_ip: 172.72.3.36/24
EOF
```
Run the installation script

```shell
su - gbase
cd /home/gbase/gbase_package/script
./gha_ctl install -c gbase -p /home/gbase/gbase_package -f
```

Notes:

- `-c`: database (cluster) name, default `gbase`
- `-p`: path to the configuration file, default `/home/gbase`
- `-f`: force the installation even if a cluster already exists
Installation log locations:

```shell
tailf /tmp/gha_ctl/gha_ctl.log
cd /home/gbase/gbase_db/log/om/
```

The installation takes about 5 minutes. When it finishes, the script prints:

```
{
    "ret":0,
    "msg":"Success"
}
```

The cluster is installed successfully.
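When the installation is wrapped in a script, the ret field can be tested instead of reading the JSON by eye. A naive grep-based sketch (the helper name is mine; it assumes the compact `"ret":0` formatting shown above):

```shell
# install_ok: succeeds when the gha_ctl JSON on stdin reports ret = 0
install_ok() { grep -q '"ret":0'; }

if echo '{ "ret":0, "msg":"Success" }' | install_ok; then
    echo "install OK"
fi
```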
Status check

Run:

```shell
/home/gbase/gbase_package/script/gha_ctl monitor -l http://172.72.3.30:2379
/home/gbase/gbase_package/script/gha_ctl monitor -l http://172.72.3.30:2379 -H
```

Output like the following means the cluster installed correctly and the data services are running:
```shell
[gbase@gbase8c_1 ~]$ /home/gbase/gbase_package/script/gha_ctl monitor -l http://172.72.3.30:2379 -H
+----+-------------+-------------+-------+---------+--------+
| No | name        | host        | port  | state   | leader |
+----+-------------+-------------+-------+---------+--------+
| 0  | gha_server1 | 172.72.3.30 | 20001 | running | True   |
+----+-------------+-------------+-------+---------+--------+
+----+------+-------------+------+---------------------------+---------+---------+
| No | name | host        | port | work_dir                  | state   | role    |
+----+------+-------------+------+---------------------------+---------+---------+
| 0  | gtm1 | 172.72.3.30 | 6666 | /home/gbase/data/gtm/gtm1 | running | primary |
+----+------+-------------+------+---------------------------+---------+---------+
+----+------+-------------+------+----------------------------+---------+---------+
| No | name | host        | port | work_dir                   | state   | role    |
+----+------+-------------+------+----------------------------+---------+---------+
| 0  | cn1  | 172.72.3.30 | 5432 | /home/gbase/data/coord/cn1 | running | primary |
+----+------+-------------+------+----------------------------+---------+---------+
+----+-------+-------+-------------+-------+----------------------------+---------+---------+
| No | group | name  | host        | port  | work_dir                   | state   | role    |
+----+-------+-------+-------------+-------+----------------------------+---------+---------+
| 0  | dn1   | dn1_1 | 172.72.3.31 | 15432 | /home/gbase/data/dn1/dn1_1 | running | primary |
| 1  | dn2   | dn2_1 | 172.72.3.32 | 20010 | /home/gbase/data/dn2/dn2_1 | running | primary |
+----+-------+-------+-------------+-------+----------------------------+---------+---------+
+----+-------------------------+--------+---------+----------+
| No | url                     | name   | state   | isLeader |
+----+-------------------------+--------+---------+----------+
| 0  | http://172.72.3.30:2379 | node_0 | healthy | True     |
+----+-------------------------+--------+---------+----------+

[gbase@gbase8c_1 ~]$ /home/gbase/gbase_package/script/gha_ctl monitor -l http://172.72.3.30:2379
{
    "cluster": "gbase",
    "version": "V5_S3.0.0B76",
    "server": [
        {
            "name": "gha_server1",
            "host": "172.72.3.30",
            "port": "20001",
            "state": "running",
            "isLeader": true
        }
    ],
    "gtm": [
        {
            "name": "gtm1",
            "host": "172.72.3.30",
            "port": "6666",
            "workDir": "/home/gbase/data/gtm/gtm1",
            "agentPort": "8001",
            "state": "running",
            "role": "primary",
            "agentHost": "172.72.3.30"
        }
    ],
    "coordinator": [
        {
            "name": "cn1",
            "host": "172.72.3.30",
            "port": "5432",
            "workDir": "/home/gbase/data/coord/cn1",
            "agentPort": "8003",
            "state": "running",
            "role": "primary",
            "agentHost": "172.72.3.30",
            "central": true
        }
    ],
    "datanode": {
        "dn1": [
            {
                "name": "dn1_1",
                "host": "172.72.3.31",
                "port": "15432",
                "workDir": "/home/gbase/data/dn1/dn1_1",
                "agentPort": "8005",
                "state": "running",
                "role": "primary",
                "agentHost": "172.72.3.31"
            }
        ],
        "dn2": [
            {
                "name": "dn2_1",
                "host": "172.72.3.32",
                "port": "20010",
                "workDir": "/home/gbase/data/dn2/dn2_1",
                "agentPort": "8007",
                "state": "running",
                "role": "primary",
                "agentHost": "172.72.3.32"
            }
        ]
    },
    "dcs": {
        "clusterState": "healthy",
        "members": [
            {
                "url": "http://172.72.3.30:2379",
                "id": "b337bb545c63ee23",
                "name": "node_0",
                "isLeader": true,
                "state": "healthy"
            }
        ]
    }
}
[gbase@gbase8c_1 ~]$
```
Starting and stopping the database

Stop the database services:

```shell
su - gbase
/home/gbase/gbase_package/script/gha_ctl stop all -l http://172.72.3.30:2379

# or:
systemctl stop coordinator_gbase_cn1.service
systemctl stop datanode_gbase_dn1_1.service
systemctl stop gtm_gbase_gtm1.service
systemctl stop server_gbase_gha_server1.service
systemctl stop etcd.service
```

Start the database services:

```shell
systemctl start etcd.service

su - gbase
/home/gbase/gbase_package/script/gha_ctl start all -l http://172.72.3.30:2379
```
Connection and SQL tests

On the primary node gbase8c_1, run `gsql -d postgres -p 5432`. The `postgres=#` prompt indicates that the gsql client has successfully connected to the GBase 8c database.

```shell
[gbase@gbase8c_1 script]$ gsql -d postgres -p 5432
gsql ((multiple_nodes GBase8cV5 3.0.0B76 build 47948f99) compiled at 2023-02-27 16:04:20 commit 0 last mr 1232 )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.

testdb=# select version();
                                                                                           version
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.2.4 (multiple_nodes GBase8cV5 3.0.0B76 build 47948f99) compiled at 2023-02-27 16:04:20 commit 0 last mr 1232 on x86_64-unknown-linux-gnu, compiled by g++ (GCC) 7.3.0, 64-bit
(1 row)

postgres=# create database testdb;
CREATE DATABASE
postgres=# \c testdb
Non-SSL connection (SSL connection is recommended when requiring high-security)
You are now connected to database "testdb" as user "gbase".
testdb=# create table student(ID int, Name varchar(10));
CREATE TABLE
testdb=# insert into student values(1, 'Mike'),(2,'John');
INSERT 0 2
testdb=# select * from student;
 id | name
----+------
  1 | Mike
  2 | John
(2 rows)

testdb=# \l
                         List of databases
   Name    | Owner | Encoding | Collate | Ctype | Access privileges
-----------+-------+----------+---------+-------+-------------------
 postgres  | gbase | UTF8     | C       | C     |
 template0 | gbase | UTF8     | C       | C     | =c/gbase         +
           |       |          |         |       | gbase=CTc/gbase
 template1 | gbase | UTF8     | C       | C     | =c/gbase         +
           |       |          |         |       | gbase=CTc/gbase
 testdb    | gbase | UTF8     | C       | C     |
(4 rows)

testdb=# \d student
      Table "public.student"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 id     | integer               |
 name   | character varying(10) |

testdb=#
```
Uninstalling the cluster

Run the following on the primary node (172.72.3.30):

```shell
# 1. Stop the cluster services on all nodes
gha_ctl stop all -l http://172.72.3.30:2379

# 2. Uninstall the cluster software
gha_ctl uninstall -l http://172.72.3.30:2379

# 3. Destroy the dcs cluster
cd /home/gbase/gbase_package/script
./gha_ctl destroy dcs -l http://172.72.3.30:2379

netstat -tulnp
systemctl list-units | grep gbase

systemctl stop coordinator_gbase_cn1.service
systemctl stop datanode_gbase_dn1_1.service
systemctl stop gtm_gbase_gtm1.service
systemctl stop server_gbase_gha_server1.service
systemctl stop etcd.service

systemctl disable coordinator_gbase_cn1.service
systemctl disable datanode_gbase_dn1_1.service
systemctl disable gtm_gbase_gtm1.service
systemctl disable server_gbase_gha_server1.service
systemctl disable etcd.service

pkill -9 python3
pkill -9 gaussdb

rm -rf /home/gbase/data
rm -rf /home/gbase/gbase_db
```
Environment variables

```shell
[gbase@gbase8c_1 om]$ cat /home/gbase/.bashrc
export GPHOME=/home/gbase/gbase_db/om
export PATH=$GPHOME/script/gspylib/pssh/bin:$GPHOME/venv/bin:$GPHOME/script:$PATH
export LD_LIBRARY_PATH=$GPHOME/lib:$LD_LIBRARY_PATH
export PYTHONPATH=$GPHOME/lib
export GAUSSHOME=/home/gbase/gbase_db/app
export PATH=$GAUSSHOME/bin:$PATH
export LD_LIBRARY_PATH=$GAUSSHOME/lib:$LD_LIBRARY_PATH
export S3_CLIENT_CRT_FILE=$GAUSSHOME/lib/client.crt
export GAUSS_VERSION=V5_S3.0.0B76
export PGHOST=/home/gbase/gbase_db/tmp
export GAUSSLOG=/home/gbase/gbase_db/log
umask 077
export GAUSS_ENV=2
export GAUSS_PSSH_PORT=22
export GS_CLUSTER_NAME=gbase
```

The installation script sets these up automatically; no manual configuration is needed.
Changing passwords

```shell
postgres=# select * from pg_roles;
 rolname | rolsuper | rolinherit | rolcreaterole | rolcreatedb | rolcatupdate | rolcanlogin | rolreplication | rolauditadmin | rolsystemadmin | rolconnlimit | rolpassword | rolvalidbegin | rolvaliduntil | rolrespool | rolparentid | roltabspace | rolconfig | oid | roluseft | rolkind | nodegroup | roltempspace | rolspillspace | rolmonitoradmin | roloperatoradmin | rolpolicyadmin
---------+----------+------------+---------------+-------------+--------------+-------------+----------------+---------------+----------------+--------------+-------------+---------------+---------------+------------+-------------+-------------+-----------+-----+----------+---------+-----------+--------------+---------------+-----------------+------------------+----------------
 gbase                    | t | t | t | t | t | t | t | t | t | -1 | ******** | | | default_pool | 0 | | | 10    | t | n | | | | t | t | t
 gs_role_directory_drop   | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1059  | f | n | | | | f | f | f
 gs_role_directory_create | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1056  | f | n | | | | f | f | f
 gs_role_pldebugger       | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1055  | f | n | | | | f | f | f
 gs_role_account_lock     | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1048  | f | n | | | | f | f | f
 gs_role_replication      | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1047  | f | n | | | | f | f | f
 gs_role_tablespace       | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1046  | f | n | | | | f | f | f
 gs_role_signal_backend   | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1045  | f | n | | | | f | f | f
 gs_role_copy_files       | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1044  | f | n | | | | f | f | f
 gha                      | f | t | f | f | f | t | f | f | t | -1 | ******** | | | default_pool | 0 | | | 16965 | f | n | | | | f | f | f
(10 rows)

postgres=# alter system set password_policy=0;
ALTER SYSTEM SET
postgres=# alter system set password_encryption_type=1;
ALTER SYSTEM SET
postgres=# ALTER ROLE gbase PASSWORD 'lhr';
ALTER ROLE
postgres=# ALTER ROLE gha PASSWORD 'lhr';
```
Configure remote login

```shell
cat >> /home/gbase/data/coord/cn1/postgresql.conf <<"EOF"
listen_addresses = '*'
EOF

echo "host all all 0.0.0.0/0 md5" >> /home/gbase/data/coord/cn1/pg_hba.conf

# The initial database user gbase cannot connect remotely over IP,
# so create another user for remote connections:
gsql -p 5432 -d postgres
create user lhr with password 'lhr' sysadmin;
grant all PRIVILEGES to lhr;

postgres=# select * from pg_roles;
 rolname | rolsuper | rolinherit | rolcreaterole | rolcreatedb | rolcatupdate | rolcanlogin | rolreplication | rolauditadmin | rolsystemadmin | rolconnlimit | rolpassword | rolvalidbegin | rolvaliduntil | rolrespool | rolparentid | roltabspace | rolconfig | oid | roluseft | rolkind | nodegroup | roltempspace | rolspillspace | rolmonitoradmin | roloperatoradmin | rolpolicyadmin
---------+----------+------------+---------------+-------------+--------------+-------------+----------------+---------------+----------------+--------------+-------------+---------------+---------------+------------+-------------+-------------+-----------+-----+----------+---------+-----------+--------------+---------------+-----------------+------------------+----------------
 gbase                    | t | t | t | t | t | t | t | t | t | -1 | ******** | | | default_pool | 0 | | | 10    | t | n | | | | t | t | t
 gs_role_directory_drop   | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1059  | f | n | | | | f | f | f
 gs_role_directory_create | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1056  | f | n | | | | f | f | f
 gs_role_pldebugger       | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1055  | f | n | | | | f | f | f
 gs_role_account_lock     | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1048  | f | n | | | | f | f | f
 gs_role_replication      | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1047  | f | n | | | | f | f | f
 gs_role_tablespace       | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1046  | f | n | | | | f | f | f
 gs_role_signal_backend   | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1045  | f | n | | | | f | f | f
 gs_role_copy_files       | f | t | f | f | f | f | f | f | f | -1 | ******** | | | default_pool | 0 | | | 1044  | f | n | | | | f | f | f
 lhr                      | f | t | f | f | f | t | f | f | t | -1 | ******** | | | default_pool | 0 | | | 16975 | f | n | | | | f | f | f
 gha                      | f | t | f | f | f | t | f | f | t | -1 | ******** | | | default_pool | 0 | | | 16965 | f | n | | | | f | f | f
(11 rows)

postgres=#

# After restarting the cluster, connect remotely:
gsql -U gha -h 172.72.3.30 -p 5432 -d postgres

C:\Users\lhr>psql -U gha -h 192.168.88.80 -p 5432 -d postgres
Password for user gha:
psql (14.0, server 9.2.4)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

postgres=> \l
                         List of databases
   Name    | Owner | Encoding | Collate | Ctype | Access privileges
-----------+-------+----------+---------+-------+-------------------
 postgres  | gbase | UTF8     | C       | C     |
 template0 | gbase | UTF8     | C       | C     | =c/gbase         +
           |       |          |         |       | gbase=CTc/gbase
 template1 | gbase | UTF8     | C       | C     | =c/gbase         +
           |       |          |         |       | gbase=CTc/gbase
 testdb    | gbase | UTF8     | C       | C     |
(4 rows)

postgres=>
```
Before connecting with the DBeaver client, change the following database parameters first:

```shell
gs_guc reload -Z coordinator -N all -I all -h "host all all 0.0.0.0/0 sha256"
gs_guc reload -Z coordinator -N all -I all -c "password_encryption_type=1"
gs_guc reload -Z coordinator -N all -I all -c "listen_addresses = '*'"
```

GBase 8c commands for changing parameters:

```shell
gs_guc reload -Z coordinator -N all -I all -c "enable_gbase_mot=true"
gs_guc reload -Z GTM -N all -I all -c "enable_gbase_mot=true"
gs_guc reload -Z datanode -N all -I all -c "enable_gbase_mot=true"
```
Troubleshooting installation errors

gaussDB state is Coredump

Cause: the CPU is not supported.

Check whether the CPU has the avx and rdtscp instruction sets; if not, GBase 8c cannot be installed:

```shell
lscpu | grep avx
lscpu | grep rdtscp
```
AVX (Advanced Vector Extensions) is a family of x86 instruction-set extensions for accelerating vector and matrix operations, first introduced with Intel's Sandy Bridge microarchitecture. AVX adds 256-bit and 128-bit SIMD (single instruction, multiple data) instructions for floating-point, integer, and logical operations, processing multiple data elements in one operation instead of one at a time. This makes it well suited to compute-intensive workloads such as scientific computing, graphics, media processing, and data analytics. AVX also brings new high-level instructions such as FMA (Fused Multiply-Add), which combines a multiplication and an addition into a single operation for higher throughput, plus new control instructions that support more flexible branching and looping. In short, AVX can significantly speed up applications that do heavy floating-point and vector work.
rdtscp is an x86 instruction that reads the 64-bit time stamp counter (TSC), returning both the current processor cycle count and the TSC value. The TSC is a monotonically increasing counter of clock cycles since power-on, commonly used for timing code, performance analysis, and synchronization. rdtscp provides consistent counter values across cores on multi-core CPUs, which is very useful for cross-core time measurement and debugging. Note that it can be affected by clock-frequency scaling and CPU migration, so in some special cases it may not be perfectly precise.
This issue affects openGauss, GaussDB, and GBase 8c alike.
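The two greps above can be combined into a single check; a sketch that reads /proc/cpuinfo (the helper name is mine):

```shell
# has_flag "FLAGS" NAME -> succeeds when NAME appears as a whole word
has_flag() { echo "$1" | grep -qw "$2"; }

flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
for f in avx rdtscp; do
    if has_flag "$flags" "$f"; then
        echo "$f: present"
    else
        echo "$f: MISSING - GBase 8c cannot run on this CPU"
    fi
done
```

Because `grep -w` matches whole words only, `avx2` in the flag list does not count as `avx`.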
An unsupported CPU looks like this:

```shell
[root@lhrdb greenplum]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    1
Core(s) per socket:    16
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 15
Model name:            Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
Stepping:              11
CPU MHz:               2593.912
BogoMIPS:              5187.82
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0-15
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl eagerfpu pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm rsb_ctxsw

[root@lhrdb greenplum]# lscpu | grep Flags
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl eagerfpu pni ssse3 cx16 pcid sse4_2 x2apic hypervisor lahf_lm rsb_ctxsw
[root@lhrdb greenplum]#
```
Supported CPUs look like this:

```shell
[root@lhrxxt ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 141
Model name:            11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
Stepping:              1
CPU MHz:               2304.009
BogoMIPS:              4608.01
Virtualization:        VT-x
Hypervisor vendor:     Microsoft
Virtualization type:   full
L1d cache:             48K
L1i cache:             32K
L2 cache:              1280K
L3 cache:              24576K
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect flush_l1d arch_capabilities

[root@lhrxxt ~]# lscpu | grep Flags
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect flush_l1d arch_capabilities
[root@lhrxxt ~]#

[root@mdw ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz
Stepping:              1
CPU MHz:               2397.223
BogoMIPS:              4794.44
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0-7
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 invpcid rtm rdseed adx smap xsaveopt arat
[root@mdw ~]#
```
Fix: if this happens on a corporate virtualization platform, ask the platform administrator for help; exposing these instruction sets to the guest requires changing VM settings and restarting the server.

After installation completes, log out of the gbase user and log back in (or re-source the environment file); otherwise the binaries and shared libraries cannot be found:
```shell
[gbase@gbase8c_1 script]$ gsql -d postgres -p 5432
bash: gsql: command not found...

[gbase@gbase8c_1 bin]$ ./gsql -d postgres -p 5432
./gsql: error while loading shared libraries: libcjson.so.1: cannot open shared object file: No such file or directory

[gbase@gbase8c_1 bin]$ ldd gsql
...
        libcjson.so.1 => not found
        libcurl.so.4 => /lib64/libcurl.so.4 (0x00007fccb7acb000)
        libgssapi_krb5_gauss.so.2 => not found
        libgssrpc_gauss.so.4 => not found
        libkrb5_gauss.so.3 => not found
        libkrb5support_gauss.so.0 => not found
        libk5crypto_gauss.so.3 => not found
        libcom_err_gauss.so.3 => not found
        libpq_ce.so.5 => not found

# Fix: run ldconfig and/or source ~/.bashrc
```
Current gtm center group num 1 is out of range [0, 0]

```
[BACKEND] FATAL: Invalid attribute for gtm server.
[BACKEND] DETAIL: Current gtm center group num 1 is out of range [0, 0].
```

Cause: related to the underlying virtualization environment. I hit this because I was running under WSL2; installing under VMware Workstation or other environments does not raise this error.
```
Configuration is completed.
Using gbase:gbase to install database.
Using installation program path : /home/gbase/gbase_db/app_89583529
$GAUSSHOME points to /home/gbase/gbase_db/app_89583529, no need to create symbolic link.
Traceback (most recent call last):
  File "/home/gbase/gbase_db/om/script/local/Install.py", line 784, in <module>
    functionDict[g_opts.action]()
  File "/home/gbase/gbase_db/om/script/local/Install.py", line 708, in startCluster
    gtm.start(self.time_out)
  File "/home/gbase/gbase_db/om_89583529/script/local/../gspylib/component/Kernel/Kernel.py", line 109, in start
    raise Exception(ErrorCode.GAUSS_516["GAUSS_51607"] % "instance"
Exception: [GAUSS-51607] : Failed to start instance. Error: Please check the gs_ctl log for failure details.
[2023-03-20 09:11:25.006][10270][][gs_ctl]: gs_ctl started,datadir is /home/gbase/data/gtm/gtm1
[2023-03-20 09:11:25.049][10270][][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.
0 LOG:  [Alarm Module]Host Name: lhrxxt
0 LOG:  [Alarm Module]Host IP: 192.168.8.8
0 LOG:  [Alarm Module]Cluster Name: gbase
0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 57
0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
0 LOG:  bbox_dump_path is set to /mnt/wslg/dumps/
2023-03-20 09:11:25.129 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm DB010 0 [REDO] LOG: Recovery parallelism, cpu count = 16, max = 4, actual = 4
2023-03-20 09:11:25.129 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm DB010 0 [REDO] LOG: ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
gaussdb.state does not exist, and skipt setting since it is optional.
2023-03-20 09:11:25.135 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 00000 0 [BACKEND] LOG: [Alarm Module]can not read GAUSS_WARNING_TYPE env.
2023-03-20 09:11:25.135 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 00000 0 [BACKEND] LOG: [Alarm Module]Host Name: lhrxxt
2023-03-20 09:11:25.135 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 00000 0 [BACKEND] LOG: [Alarm Module]Host IP: 192.168.8.8
2023-03-20 09:11:25.135 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 00000 0 [BACKEND] LOG: [Alarm Module]Cluster Name: gbase
2023-03-20 09:11:25.135 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 00000 0 [BACKEND] LOG: [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 57
2023-03-20 09:11:25.141 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 00000 0 [BACKEND] LOG: loaded library "security_plugin"
2023-03-20 09:11:25.146 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2023-03-20 09:11:25.146 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2023-03-20 09:11:25.160 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm OP0A3 0 [BACKEND] FATAL: Invalid attribute for gtm server.
2023-03-20 09:11:25.160 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm OP0A3 0 [BACKEND] DETAIL: Current gtm center group num 1 is out of range [0, 0].
2023-03-20 09:11:25.160 6417b2bd.1 [unknown] 140672756703680 [unknown] 0 gtm OP0A3 0 [BACKEND] BACKTRACELOG: tid[10273]'s backtrace:
/home/gbase/gbase_db/app/bin/gaussdb(+0xf21792) [0x55a626f21792]
/home/gbase/gbase_db/app/bin/gaussdb(_Z9errfinishiz+0x391) [0x55a626f179d1]
/home/gbase/gbase_db/app/bin/gaussdb(_ZN18GtmServerControler9ParseAttrEv+0x2a3) [0x55a62759e193]
/home/gbase/gbase_db/app/bin/gaussdb(_ZN18GtmServerControler16SetGtmServerInfoEv+0x5f) [0x55a62759f45f]
/home/gbase/gbase_db/app/bin/gaussdb(_Z14PostmasterMainiPPc+0x172f) [0x55a62748c66f]
/home/gbase/gbase_db/app/bin/gaussdb(main+0x3ef) [0x55a626a2906f]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ff0e6e02555]
/home/gbase/gbase_db/app/bin/gaussdb(+0xa9dea7) [0x55a626a9dea7]
Use addr2line to get pretty function name and line
[2023-03-20 09:11:26.050][10270][][gs_ctl]: waitpid 10273 failed, exitstatus is 256, ret is 2
[2023-03-20 09:11:26.050][10270][][gs_ctl]: stopped waiting
[2023-03-20 09:11:26.050][10270][][gs_ctl]: could not start server
Examine the log output.
```
Rpc request failed:dn1_1 save node info

```
{
    "ret":80000116,
    "msg":"Rpc request failed:dn1_1 save node info"
}
```

Cause: insufficient system resources. Try adding memory or reducing the shared_buffers parameter, then restart the cluster.
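The shared_buffers reduction can be scripted the same way as in the installation section above; a hedged sketch (the helper name and the 256MB value are my own choices):

```shell
# lower_shared_buffers CONF_FILE NEW_VALUE
# Rewrites the shared_buffers line of a postgresql.conf-style file in place.
lower_shared_buffers() {
    sed -i "s/^shared_buffers = .*/shared_buffers = $2/" "$1"
}

# Example, using the data directories from this deployment:
# lower_shared_buffers /home/gbase/data/dn1/dn1_1/postgresql.conf 256MB
# /home/gbase/gbase_package/script/gha_ctl stop all  -l http://172.72.3.30:2379
# /home/gbase/gbase_package/script/gha_ctl start all -l http://172.72.3.30:2379
```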
OPENSSL_1_1_1 not defined in file libcrypto.so.1.1

```
/home/gbase/gbase_db/app/bin/gaussdb: relocation error: /home/gbase/gbase_db/app/bin/gaussdb: symbol EVP_sm4_ctr, version OPENSSL_1_1_1 not defined in file libcrypto.so.1.1 with link time reference
no data was returned by command ""/home/gbase/gbase_db/app/bin/gaussdb" -V"
The program "gaussdb" is needed by gs_initdb but was not found in the same directory as "/home/gbase/gbase_db/app/bin/gs_initdb".
Check your installation.
Exception ignored in: <function Pool.__del__ at 0x7fbf2be67af0>
Traceback (most recent call last):
  File "./lib/python3.8/multiprocessing/pool.py", line 268, in __del__
  File "./lib/python3.8/multiprocessing/queues.py", line 368, in put
  File "./lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
  File "./lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
  File "./lib/python3.8/multiprocessing/connection.py", line 368, in _send
OSError: [Errno 9] Bad file descriptor
```

Cause: the installed openssl version is too new. Reinstall the distribution's default version with yum; see: https://www.xmmup.com/opensslshengji.html