When Does GreenPlum Fail Over Automatically, and How to Investigate (OOM)
Tags: GreenPlum, OOM, segment fault handling, automatic failover
Introduction

When an OOM occurs, the gpcc alert notifications will contain an "[Alert] Out of memory errors" message.

Case 1 (most common): OOM

The typical trigger for an automatic failover is an OOM event. The master log file will contain entries such as "FTS: cannot establish libpq connection (content=0, dbid=11): could not fork new process for connection: Cannot allocate memory", "FATAL: Out of memory. Failed on request of size 144 bytes. (context 'GPORCA memory pool')", or "FATAL: the database system is in recovery mode".

If no swap is configured, OOM becomes much more likely, and in severe cases it causes an automatic segment failover:
```
FTS: cannot establish libpq connection (content=0, dbid=11): could not fork new process for connection: Cannot allocate memory
The previous session was reset because its gang was disconnected (session id = 6072). The new session id = 109485
FATAL: Out of memory. Failed on request of size 144 bytes. (context 'GPORCA memory pool')
FATAL: the database system is in recovery mode
gang was lost due to cluster reconfiguration (cdbgang_async.c:97)
rejecting TCP connection to master using internal connection protocol
Any temporary tables for this session have been dropped because the gang was disconnected (session id = 85341)
failed to acquire resources on one or more segments
```
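A quick host-level check helps confirm the memory posture behind these errors. This is a generic Linux sketch; the recommended overcommit values follow common Greenplum installation guidance and are an assumption to adapt, not output captured from this incident:

```shell
# Swap totals: with no swap configured, allocation failures surface as
# the "Cannot allocate memory" fork errors shown above.
free -g

# Overcommit policy: Greenplum installs typically use strict accounting
# (vm.overcommit_memory=2) with vm.overcommit_ratio sized to RAM/swap.
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio
```

If `overcommit_memory` is 0 (heuristic overcommit) and there is no swap, the kernel OOM killer can take out a postmaster directly, which is exactly the failover scenario described here.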
Case 2: Resource exhaustion under heavy CPU or host load

A failover can also be triggered when a node runs out of system resources under heavy CPU or overall host load.
For example, when the maximum number of processes is exceeded, the log reports:
```
could not fork new process for connection: Resource temporarily unavailable
could not fork new process for connection: Resource temporarily unavailable (seg0 129.10.25.26:6000)
FATAL: InitMotionLayerIPC: failed to create thread (ic_udpifc.c:1488)
DETAIL: pthread_create() failed with err 11 (seg11 129.10.25.26:7003)
```
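"err 11" in `pthread_create()` is EAGAIN ("Resource temporarily unavailable"): the per-user process/thread limit has been exhausted. A minimal sketch to compare current usage against the soft limit, using generic Linux commands rather than anything specific to this cluster:

```shell
# Soft limit on user processes/threads (nproc); fork()/pthread_create()
# fail with EAGAIN (err 11) once this is exhausted.
echo "nproc soft limit : $(ulimit -u)"

# Approximate count of processes owned by the current user right now.
echo "processes in use : $(ps -u "$(id -un)" --no-headers | wc -l)"
```

If the second number is close to the first on a segment host, the fix below (raising `nproc`/`nofile`) is the right direction.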
This error usually means the kernel and ulimit parameters were never tuned. Fix as follows:
```shell
ll /lib64/security/pam_limits.so
echo "session required /lib64/security/pam_limits.so" >> /etc/pam.d/login

cat >> /etc/security/limits.conf <<"EOF"
* soft nofile 655350
* hard nofile 655350
* soft nproc 655350
* hard nproc 655350
gpadmin soft priority -20
EOF

sed -i 's/4096/655350/' /etc/security/limits.d/20-nproc.conf
cat /etc/security/limits.d/20-nproc.conf

cat >> /etc/sysctl.conf <<"EOF"
fs.file-max=9000000
fs.inotify.max_user_instances = 1000000
fs.inotify.max_user_watches = 1000000
kernel.pid_max=4194304
EOF
sysctl -p
```
Reboot the host for everything to take effect (`sysctl -p` applies the kernel parameters immediately; the PAM limits only apply to new login sessions).
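After the reboot (or a fresh login, for the PAM limits), the new values can be verified. This is a generic check; the expected numbers are simply the values written above:

```shell
# Per-session limits written via limits.conf / 20-nproc.conf
ulimit -n   # open files; expected 655350 after the change
ulimit -u   # max user processes; expected 655350 after the change

# Kernel-wide values written via sysctl.conf
cat /proc/sys/fs/file-max     # expected 9000000 after the change
cat /proc/sys/kernel/pid_max  # expected 4194304 after the change
```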
Case 3: A killed instance

Manually killing a PG instance process (e.g. the postmaster of a primary segment) also triggers an automatic failover.

Troubleshooting SQL
```sql
-- Find the time of the failover
select * from gp_configuration_history ORDER BY time desc limit 100;

-- Check the time range covered by the log history views
select 'gp_log_master_ext' tb, min(logtime), max(logtime) from gpmetrics.gp_log_master_ext
union all
select 'gpcc_pg_log_history' tb, min(logtime), max(logtime) from gpmetrics.gpcc_pg_log_history;

-- Detailed log around the failover (sometimes empty; then query the log files directly)
SELECT d.logtime, d.loguser, d.logdatabase, d.loghost, d.logsessiontime,
       d.logseverity, d.logmessage, d.logdebug, d.logdetail, logcontext
from gpmetrics.gpcc_pg_log_history d
where d.logseverity not in ('LOG')
  and d.logstate not in ('58P01')
  -- and d.logtime >= now() - interval '7 day'
  and d.logtime >= '2023-07-12 02:10'
  and d.logtime <= '2023-07-12 12:18'
  and (   d.logmessage like 'Out of memory%'
       or d.logmessage like 'failed to acquire resources on one or more segments%'
       or d.logmessage like 'FTS detected connection lost during dispatch to%'
       or d.logmessage like 'ERROR: FTS double fault detected%'
       or d.logmessage like 'Canceling query because of high VMEM usage%'
       or d.logdetail like '%Cannot allocate memory%'
       or d.logdetail like '%FATAL: out of memory%'
       or d.logdetail like '%System memory limit reached, failed to allocate %'
       or d.logdetail like 'FTS detected one or more segments are down'
       or d.logdetail like '%Vmem limit reached, failed to allocate%'
       or d.logdetail like 'FATAL: InitMotionLayerIPC: failed to create thread%'
       or d.logdetail like 'could not fork new process for connection%'
       or d.logdetail like '%Resource temporarily unavailable%')
order by d.logtime
limit 500;

SELECT d.logtime, d.loguser, d.logdatabase, d.loghost, d.logsessiontime,
       d.logseverity, d.logmessage, d.logdebug, d.logdetail, logcontext
FROM gpmetrics.gp_log_master_ext d
where d.logseverity not in ('LOG')
  and d.logstate not in ('58P01')
  -- and d.logtime >= now() - interval '7 day'
  and d.logtime >= '2023-07-10 10:00'
  and d.logtime <= '2023-07-10 11:01'
order by d.logtime;

-- To read the log files directly, locate them via the postgres processes:
--   ps -ef | grep post | grep bin
--   cd /opt/greenplum/data/master/gpseg-1/pg_log

-- System metrics for all hosts, sampled every 15 seconds
SELECT ctime::date || ' '
       || CASE WHEN extract(hour from ctime) < 10 THEN '0' ELSE '' END || extract(hour from ctime)
       || ':'
       || CASE WHEN extract(MINUTE from ctime) < 10 THEN '0' ELSE '' END || extract(MINUTE from ctime) AS time1
     , hostname
     , ROUND(AVG(cpu_sys::numeric),2) AS cpu_sys
     , ROUND(avg(cpu_user::numeric),2) AS cpu_user
     , ROUND(avg((cpu_sys+cpu_user)::numeric),2) AS cpu_use
     , ROUND(avg((cpu_iowait)::numeric),2) AS cpu_iowait
     , ROUND(avg(gsh.cpu_idle::numeric),2) AS cpu_idle
     , ROUND(avg(mem_total::numeric/1024/1024/1024)) AS mem_total_GB
     , ROUND(avg(mem_used::numeric/1024/1024/1024),2) AS mem_used_GB
     -- , ROUND(avg(mem_actual_used::numeric/1024/1024/1024),2) AS mem_actual_GB
     , ROUND(avg((mem_buffers+mem_cached)::numeric/1024/1024/1024),2) AS mem_buffer_cache_GB
     -- , ROUND(avg(((gsh.mem_total-mem_used)::numeric)/1024/1024/1024),2) AS mem_available_GB
     , ROUND(avg(gsh.swap_total::numeric/1024/1024/1024)) as swap_total
     , ROUND(avg(gsh.swap_used::numeric/1024/1024/1024),2) as swap_used
     , ROUND(avg(gsh.swap_page_in::numeric),2) as swap_page_in
     , ROUND(avg(gsh.swap_page_out::numeric),2) as swap_page_out
     , ROUND(avg(load0::numeric),2) AS load0
     , ROUND(avg(load1::numeric),2) AS load1
     , ROUND(avg(load2::numeric),2) AS load2
     , ROUND((avg(disk_rb_rate::numeric) / 1024 / 1024),2) AS disk_R_MBs
     , ROUND((avg(disk_wb_rate::numeric) / 1024 / 1024),2) AS disk_W_MBs
     , ROUND((avg(net_rb_rate::numeric) / 1024 / 1024),2) AS net_I_MBs
     , ROUND((avg(net_wb_rate::numeric) / 1024 / 1024),2) AS net_O_MBs
FROM gpmetrics.gpcc_system_history gsh
WHERE hostname LIKE 'mdw%'
  AND ctime >= '2023-09-14 16:30:00'
  AND ctime < '2023-09-14 19:00:00'
GROUP BY 1,2
ORDER BY 1;

-- Total memory and swap per host
SELECT DISTINCT gsh.hostname,
       ROUND(gsh.mem_total::numeric/1024/1024/1024) AS mem_total_G,
       ROUND(gsh.swap_total::numeric/1024/1024/1024) AS swap_total_G
FROM gpmetrics.gpcc_system_history gsh
where ctime >= NOW() - INTERVAL '1 minute'
  and gsh.mem_total > 0
order by 1 limit 10;
```
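When the gpmetrics views come back empty for the window of interest, grep the master's pg_log files directly. The directory below is the one from this environment (found via `ps -ef`); `PGLOG` is a placeholder variable to adjust:

```shell
# Master log directory; override PGLOG if your data directory differs.
PGLOG=${PGLOG:-/opt/greenplum/data/master/gpseg-1/pg_log}

# Files that mention the failover-related errors
grep -rlE 'Out of memory|Cannot allocate memory|in recovery mode' "$PGLOG" 2>/dev/null || true

# The matching lines themselves, FTS messages included
grep -rhE 'FTS detected|double fault|could not fork' "$PGLOG" 2>/dev/null | head -20
```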
Simulating an Automatic Failover

Initial state
```
postgres=# select * from gp_segment_configuration d ORDER BY 8,4 desc,2 ;
 dbid | content | role | preferred_role | mode | status | port | hostname | address | datadir
------+---------+------+----------------+------+--------+------+----------+---------+--------------------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | gpdb6    | gpdb6   | /opt/greenplum/data/master/gpseg-1
    2 |       0 | p    | p              | s    | u      | 6000 | gpdb6    | gpdb6   | /opt/greenplum/data/primary/gpseg0
    3 |       1 | p    | p              | s    | u      | 6001 | gpdb6    | gpdb6   | /opt/greenplum/data/primary/gpseg1
    6 |      -1 | m    | m              | s    | u      | 5433 | gpdb6    | gpdb6   | /opt/greenplum/data/master_standby/gpseg-1
    4 |       0 | m    | m              | s    | u      | 7000 | gpdb6    | gpdb6   | /opt/greenplum/data/mirror/gpseg0
    5 |       1 | m    | m              | s    | u      | 7001 | gpdb6    | gpdb6   | /opt/greenplum/data/mirror/gpseg1
(6 rows)

[gpadmin@gpdb6 gpseg1]$ gpstate -e
20230721:12:36:58:004252 gpstate:gpdb6:gpadmin-[INFO]:-Starting gpstate with args: -e
20230721:12:36:58:004252 gpstate:gpdb6:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.23.1 build commit:2731a45ecb364317207c560730cf9e2cbf17d7e4 Open Source'
20230721:12:36:58:004252 gpstate:gpdb6:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.23.1 build commit:2731a45ecb364317207c560730cf9e2cbf17d7e4 Open Source) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Feb 7 2023 22:54:40'
20230721:12:36:58:004252 gpstate:gpdb6:gpadmin-[INFO]:-Obtaining Segment details from master...
20230721:12:36:58:004252 gpstate:gpdb6:gpadmin-[INFO]:-Gathering data from segments...
.
20230721:12:36:59:004252 gpstate:gpdb6:gpadmin-[INFO]:-----------------------------------------------------
20230721:12:36:59:004252 gpstate:gpdb6:gpadmin-[INFO]:-Segment Mirroring Status Report
20230721:12:36:59:004252 gpstate:gpdb6:gpadmin-[INFO]:-----------------------------------------------------
20230721:12:36:59:004252 gpstate:gpdb6:gpadmin-[INFO]:-All segments are running normally
[gpadmin@gpdb6 gpseg1]$
```
Simulating the failure
```
[gpadmin@gpdb6 pg_log]$ ps -ef|grep bin | grep post
gpadmin  9643    1 0 Jul20 ? 00:00:08 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/primary/gpseg0 -p 6000
gpadmin  9645    1 0 Jul20 ? 00:00:09 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/primary/gpseg1 -p 6001
gpadmin  9647    1 0 Jul20 ? 00:00:00 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/mirror/gpseg1 -p 7001
gpadmin  9648    1 0 Jul20 ? 00:00:00 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/mirror/gpseg0 -p 7000
gpadmin  9686    0 0 Jul20 ? 00:00:08 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/master/gpseg-1 -p 5432 -E
gpadmin  9806    1 0 Jul20 ? 00:00:00 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/master_standby/gpseg-1 -p 5433 -E
[gpadmin@gpdb6 pg_log]$ kill -9 9643
[gpadmin@gpdb6 pg_log]$ ps -ef|grep bin | grep post
gpadmin  9645    1 0 Jul20 ? 00:00:09 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/primary/gpseg1 -p 6001
gpadmin  9647    1 0 Jul20 ? 00:00:00 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/mirror/gpseg1 -p 7001
gpadmin  9648    1 0 Jul20 ? 00:00:00 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/mirror/gpseg0 -p 7000
gpadmin  9686    0 0 Jul20 ? 00:00:08 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/master/gpseg-1 -p 5432 -E
gpadmin  9806    1 0 Jul20 ? 00:00:00 /usr/local/greenplum-db-6.23.1/bin/postgres -D /opt/greenplum/data/master_standby/gpseg-1 -p 5433 -E

postgres=# select * from gp_segment_configuration d ORDER BY 8,4 desc,2 ;
 dbid | content | role | preferred_role | mode | status | port | hostname | address | datadir
------+---------+------+----------------+------+--------+------+----------+---------+--------------------------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | gpdb6    | gpdb6   | /opt/greenplum/data/master/gpseg-1
    2 |       0 | m    | p              | n    | d      | 6000 | gpdb6    | gpdb6   | /opt/greenplum/data/primary/gpseg0
    3 |       1 | p    | p              | s    | u      | 6001 | gpdb6    | gpdb6   | /opt/greenplum/data/primary/gpseg1
    6 |      -1 | m    | m              | s    | u      | 5433 | gpdb6    | gpdb6   | /opt/greenplum/data/master_standby/gpseg-1
    4 |       0 | p    | m              | n    | u      | 7000 | gpdb6    | gpdb6   | /opt/greenplum/data/mirror/gpseg0
    5 |       1 | m    | m              | s    | u      | 7001 | gpdb6    | gpdb6   | /opt/greenplum/data/mirror/gpseg1
(6 rows)
```
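After the failover, content 0 is served by its former mirror (dbid=4, role p) while the old primary (dbid=2) sits in status d. A sketch of the usual recovery sequence, to be run as gpadmin on the master host once the failed host is healthy again (cluster-specific, so not runnable outside the cluster):

```shell
# Incrementally recover the downed segment so it rejoins as a mirror:
gprecoverseg -a

# Watch until resynchronization finishes (mode returns to 's'):
gpstate -e

# Then swap segments back to their preferred roles:
gprecoverseg -r
```

`gprecoverseg -F` performs a full recovery instead of an incremental one when the downed segment's data directory is damaged.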