PostgreSQL能设计出利用多 CPU 让查询更快的查询计划。这种特性被称为并行查询。由于现有实现的限制或者因为没有比连续查询计划更快的查询计划存在，很多查询并不能从并行查询获益。不过，对于那些可以从并行查询获益的查询来说，并行查询带来的速度提升是显著的。很多查询在使用并行查询时比之前快了超过两倍，有些查询是以前的四倍甚至更多的倍数。那些访问大量数据但只返回其中少数行给用户的查询最能从并行查询中获益。

postgresql 11 对parallel又有了进一步的加强。

并行创建btree索引(Parallel Index Builds)
使用共享hash table时可以并行执行hash join
单个选择如果不能并行化，则允许UNION并行运行每个SELECT
并行扫描分区表
允许 limit 传递给并行进程
允许并行进程使用索引扫描式减少返回结果
允许并行化单个计算查询、where子句聚合查询和目标列表中的函数
新加参数 parallel_leader_participation 控制执行计划中的领导者，默认启用。
并行执行CREATE TABLE … AS, CREATE MATERIALIZED VIEW, certain queries using UNION
并行hash join、并行顺序扫描在多并行进程下得到加强
在EXPLAIN中添加并行进程排序活动的报告

常用参数

postgres=# select *
postgres-#   from pg_settings ps
postgres-#  where 1=1
postgres-#    and ps.name in (
postgres(#          'force_parallel_mode',
postgres(#          'max_worker_processes',
postgres(#          'max_parallel_workers',
postgres(#          'max_parallel_maintenance_workers',
postgres(#          'max_parallel_workers_per_gather',
postgres(#          --'min_parallel_relation_size',-- add 9.6,remove from 10
postgres(#          'min_parallel_index_scan_size',
postgres(#          'min_parallel_table_scan_size',
postgres(#          'parallel_tuple_cost',
postgres(#          'parallel_setup_cost',
postgres(#          'parallel_leader_participation'
postgres(# )
postgres-# ;
               name               | setting | unit |                category                |                                             short_desc                                             |                                                                extra_desc                                                                 |  context   | vartype | source  | min_val |   max_val    |     enumvals     | boot_val | reset_val | sourcefile | sourceline | pending_restart 
----------------------------------+---------+------+----------------------------------------+----------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+------------+---------+---------+---------+--------------+------------------+----------+-----------+------------+------------+-----------------
 force_parallel_mode              | off     |      | Query Tuning / Other Planner Options   | Forces use of parallel query facilities.                                                           | If possible, run query using a parallel worker and with parallel restrictions.                                                            | user       | enum    | default |         |              | {off,on,regress} | off      | off       |            |            | f
 max_parallel_maintenance_workers | 2       |      | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per maintenance operation.                           |                                                                                                                                           | user       | integer | default | 0       | 1024         |                  | 2        | 2         |            |            | f
 max_parallel_workers             | 8       |      | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel workers that can be active at one time.                        |                                                                                                                                           | user       | integer | default | 0       | 1024         |                  | 8        | 8         |            |            | f
 max_parallel_workers_per_gather  | 2       |      | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per executor node.                                   |                                                                                                                                           | user       | integer | default | 0       | 1024         |                  | 2        | 2         |            |            | f
 max_worker_processes             | 8       |      | Resource Usage / Asynchronous Behavior | Maximum number of concurrent worker processes.                                                     |                                                                                                                                           | postmaster | integer | default | 0       | 262143       |                  | 8        | 8         |            |            | f
 min_parallel_index_scan_size     | 64      | 8kB  | Query Tuning / Planner Cost Constants  | Sets the minimum amount of index data for a parallel scan.                                         | If the planner estimates that it will read a number of index pages too small to reach this limit, a parallel scan will not be considered. | user       | integer | default | 0       | 715827882    |                  | 64       | 64        |            |            | f
 min_parallel_table_scan_size     | 1024    | 8kB  | Query Tuning / Planner Cost Constants  | Sets the minimum amount of table data for a parallel scan.                                         | If the planner estimates that it will read a number of table pages too small to reach this limit, a parallel scan will not be considered. | user       | integer | default | 0       | 715827882    |                  | 1024     | 1024      |            |            | f
 parallel_leader_participation    | on      |      | Resource Usage / Asynchronous Behavior | Controls whether Gather and Gather Merge also run subplans.                                        | Should gather nodes also run subplans, or just gather tuples?                                                                             | user       | bool    | default |         |              |                  | on       | on        |            |            | f
 parallel_setup_cost              | 1000    |      | Query Tuning / Planner Cost Constants  | Sets the planner's estimate of the cost of starting up worker processes for parallel query.        |                                                                                                                                           | user       | real    | default | 0       | 1.79769e+308 |                  | 1000     | 1000      |            |            | f
 parallel_tuple_cost              | 0.1     |      | Query Tuning / Planner Cost Constants  | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend. |                                                                                                                                           | user       | real    | default | 0       | 1.79769e+308 |                  | 0.1      | 0.1       |            |            | f
(10 rows)

postgres=# select *

postgres-# from pg_settings ps

postgres-# where 1=1

postgres-# and ps.name in (

postgres(# 'force_parallel_mode',

postgres(# 'max_worker_processes',

postgres(# 'max_parallel_workers',

postgres(# 'max_parallel_maintenance_workers',

postgres(# 'max_parallel_workers_per_gather',

postgres(# --'min_parallel_relation_size',-- add 9.6,remove from 10

postgres(# 'min_parallel_index_scan_size',

postgres(# 'min_parallel_table_scan_size',

postgres(# 'parallel_tuple_cost',

postgres(# 'parallel_setup_cost',

postgres(# 'parallel_leader_participation'

postgres(# )

postgres-# ;

----------------------------------+---------+------+----------------------------------------+----------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+------------+---------+---------+---------+--------------+------------------+----------+-----------+------------+------------+-----------------

force_parallel_mode | off | | Query Tuning / Other Planner Options | Forces use of parallel query facilities. | If possible, run query using a parallel worker and with parallel restrictions. | user | enum | default | | | {off,on,regress} | off | off | | | f

max_parallel_maintenance_workers | 2 | | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per maintenance operation. | | user | integer | default | 0 | 1024 | | 2 | 2 | | | f

max_parallel_workers | 8 | | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel workers that can be active at one time. | | user | integer | default | 0 | 1024 | | 8 | 8 | | | f

max_parallel_workers_per_gather | 2 | | Resource Usage / Asynchronous Behavior | Sets the maximum number of parallel processes per executor node. | | user | integer | default | 0 | 1024 | | 2 | 2 | | | f

max_worker_processes | 8 | | Resource Usage / Asynchronous Behavior | Maximum number of concurrent worker processes. | | postmaster | integer | default | 0 | 262143 | | 8 | 8 | | | f

min_parallel_index_scan_size | 64 | 8kB | Query Tuning / Planner Cost Constants | Sets the minimum amount of index data for a parallel scan. | If the planner estimates that it will read a number of index pages too small to reach this limit, a parallel scan will not be considered. | user | integer | default | 0 | 715827882 | | 64 | 64 | | | f

min_parallel_table_scan_size | 1024 | 8kB | Query Tuning / Planner Cost Constants | Sets the minimum amount of table data for a parallel scan. | If the planner estimates that it will read a number of table pages too small to reach this limit, a parallel scan will not be considered. | user | integer | default | 0 | 715827882 | | 1024 | 1024 | | | f

parallel_setup_cost | 1000 | | Query Tuning / Planner Cost Constants | Sets the planner's estimate of the cost of starting up worker processes for parallel query. | | user | real | default | 0 | 1.79769e+308 | | 1000 | 1000 | | | f

parallel_tuple_cost | 0.1 | | Query Tuning / Planner Cost Constants | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend. | | user | real | default | 0 | 1.79769e+308 | | 0.1 | 0.1 | | | f

(10 rows)

max_work_processes(integer)

这个设置是当前系统中支持的最大后台进程数，默认为8

如果备库设置，备库上此参数必须大于或等于主库上的此参数配置值

此参数调整，需要重启数据库生效

max_parallel_workers

设置系统支持的并行查询进程数，默认8

这个参数值不能高于max_worker_processes

调整这个参数，建议同时调整max_parallel_workers_per_gather参数值

max_parallel_workers_per_gather

设置允许启用的并行进程的进程数，默认2

设置成0，表示禁用并行进程。

设置单个Gather或Gather Merge节点能够启用的最大并行进程数，默认值为2，并行查询(Parallel Query)的并行度受此参数的影响，CREATE INDEX 命令的并行度不受此参数影响。

max_parallel_maintenance_workers

支持内建的过程使用并行的方式工作，例如建立索引，默认设置为2。

该参数设置维护命令(例如 CREATE INDEX) 命令允许的最大并行进程数，默认值为2。

parallel_setup_cost(floating point)

设置优化器启动并行进程的成本，默认为1000

parallel_tuple_cost(floating point)

设置优化器通过并行进程处理一行数据的成本，默认为0.1

min_parallel_table_scan_size(integer)

设置开启并行的条件之一，表占用空间小于此值将不会开启并行

并行顺序扫描场景下扫描的数据大小通常等于表大小，默认值8M

min_parallel_index_scan_size(integer)

设置开启并行的条件之一，实际上并行扫描不会扫描索引所有数据块

只是扫描索引相关数据块，默认值为512kb

force_parallel_mode(enum)

强制开启并行，一般作为测试目的

其中的关系应该是

max_work_processes > max_parallel_workers > max_parallel_workers_per_gather >= max_parallel_maintenance_workers

以上参数可能并不容易理解，进一步解释如下：

max_worker_processes 参数设置的是数据库允许的最大后台进程数，并行进程属于后台进程的一种；
max_parallel_workers 参数设置数据库允许的最大并行进程数，这个值小于或等于 max_worker_processes。
并行进程数设置分为两类，第一类是并行查询，并行查询的并行度由 max_parallel_workers_per_gather 参数控制，第二类是维护命令(例如 CREATE INDEX)，维护命令的并行度由 max_parallel_maintenance_workers 参数控制。
max_parallel_workers_per_gather+max_parallel_maintenance_workers值应小于或等于 max_parallel_workers。

配置示例

max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
parallel_tuple_cost = 0.1
parallel_setup_cost = 1000.0
min_parallel_table_scan_size = 8MB
min_parallel_index_scan_size = 512kB
force_parallel_mode = off

max_worker_processes = 16

max_parallel_workers_per_gather = 4

max_parallel_workers = 8

parallel_tuple_cost = 0.1

parallel_setup_cost = 1000.0

min_parallel_table_scan_size = 8MB

min_parallel_index_scan_size = 512kB

force_parallel_mode = off

并行查询如何工作

当优化器判断对于某一个特定的查询，并行查询是最快的执行策略时，优化器将创建一个查询计划。该计划包括一个 Gather或者Gather Merge节点。下面是一个简单的例子：

EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Gather  (cost=1000.00..217018.43 rows=1 width=97)
   Workers Planned: 2
   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..216018.33 rows=1 width=97)
         Filter: (filler ~~ '%x%'::text)
(4 rows)

EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';

QUERY PLAN

-------------------------------------------------------------------------------------

Gather (cost=1000.00..217018.43 rows=1 width=97)

Workers Planned: 2

-> Parallel Seq Scan on pgbench_accounts (cost=0.00..216018.33 rows=1 width=97)

Filter: (filler ~~ '%x%'::text)

(4 rows)

本人提供Oracle(OCP、OCM)、MySQL(OCP)、PostgreSQL(PGCA、PGCE、PGCM)等数据库的培训和考证业务，私聊QQ646634621或微信db_bao，谢谢！

后续精彩内容已被小麦苗无情隐藏，请输入验证码解锁本站所有文章！

请先关注本站微信公众号，然后回复“验证码”，获取验证码。在微信里搜索“DB宝”或者“www_xmmup_com”或者微信扫描右侧二维码都可以关注本站微信公众号。

打赏赞(2)

标签： PG 优化并行

小麦苗

学习或考证，均可联系麦老师，请加微信db_bao或QQ646634621

发表回复取消回复

要发表评论，您必须先登录。

2024年 4月
一	二	三	四	五	六	日
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

嘿，我是小麦，需要帮助随时找我哦。

18509239930
个人微信
DB宝
个人邮箱
点击加入QQ群
个人微店
回到顶部

合 PG中的并行

简介

常用参数

配置示例

并行查询如何工作

相关文章

您可能还喜欢...

发表回复取消回复

网站公告

网站寄语

本站其它工具

搜索本网站

标签云☁

网站日历

网站归档

网站分类

合 PG中的并行

简介

常用参数

配置示例

并行查询如何工作

相关文章

您可能还喜欢...

PG 11中的“^@”运算符

PG行级安全策略RLS

PG口令加密算法中的scram-sha-256

发表回复 取消回复

网站公告

网站寄语

本站其它工具

搜索本网站

标签云☁

网站日历

网站归档

网站分类

发表回复取消回复