PG中的分区表
简介
http://postgres.cn/docs/13/ddl-partitioning.html
在数据库日渐庞大的今天,为了方便对数据库数据的管理,比如按时间,按地区去统计一些数据时,基数过于庞大,多有不便。很多商业数据库都提供分区的概念,按不同的维度去存放数据,便于后期的管理,PostgreSQL也不例外。
PostgresSQL分区的意思是把逻辑上的一个大表分割成物理上的几块儿。分区不仅能带来访问速度的提升,关键的是,它能带来管理和维护上的方便。
分区的具体好处是:
- 某些类型的查询性能可以得到极大提升。
- 更新的性能也可以得到提升,因为表的每块的索引要比在整个数据集上的索引要小。如果索引不能全部放在内存里,那么在索引上的读和写都会产生更多的磁盘访问。
- 批量删除可以用简单的删除某个分区来实现。
- 可以将很少用的数据移动到便宜的、转速慢的存储介质上。
PostgreSQL 有两种父、子表关系:分区(partition)和继承(inherit)。
在PG里,表分区是通过表继承来实现的,一般都是建立一个主表,里面是空,然后每个分区都去继承它。无论何时,都应保证主表里面是空的。 在PG 10之前,只能通过表继承来实现分区,从PG 10开始,可以通过DDL语句来直接创建分区表(内部原理也是通过继承来实现),这被称为声明式分区表(declaratively patitioned table)或内置分区表。PG 10仅支持范围分区和列表分区,尚未支持散列Hash分区,PG 11支持Hash分区。
小表分区不实际,表在多大情况下才考虑分区呢?PostgresSQL官方给出的建议是:当表本身大小超过了机器物理内存的实际大小时(the size of the table should exceed the physical memory of the database server),可以考虑分区。
分区表注意事项:
不支持全局的唯一、主键、排除、外键约束,只能在对应的分区建立这些约束。
索引只能在分区中创建,在主表创建不能继承到分区中。
更新数据时不能进行数据跨分区移动,否则会报错。
分区表继承特性的限制:
i.分区除了主表外,不能继承其他父表;
ii.一个普通表不能继承分区表主表。
传统方式实现分区
参考:https://www.xmmup.com/pgzhongdebiaojichengheonlyguanjianci.html
创建分区表
1 创建主表
1 2 3 4 5 6 | create table tbl_partition ( id integer, name varchar(20), gender boolean, join_date date, dept char(4)); |
2 创建分区表
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | create table tbl_partition_202011 ( check ( join_date >= DATE '2020-11-01' AND join_date < DATE '2020-12-01' ) ) INHERITS (tbl_partition); create table tbl_partition_202012 ( check ( join_date >= DATE '2020-12-01' AND join_date < DATE '2021-01-01' ) ) INHERITS (tbl_partition); create table tbl_partition_202101 ( check ( join_date >= DATE '2021-01-01' AND join_date < DATE '2021-02-01' ) ) INHERITS (tbl_partition); create table tbl_partition_202102 ( check ( join_date >= DATE '2021-02-01' AND join_date < DATE '2021-03-01' ) ) INHERITS (tbl_partition); create table tbl_partition_202103 ( check ( join_date >= DATE '2021-03-01' AND join_date < DATE '2021-04-01' ) ) INHERITS (tbl_partition); create table tbl_partition_202104 ( check ( join_date >= DATE '2021-04-01' AND join_date < DATE '2021-05-01' ) ) INHERITS (tbl_partition); create table tbl_partition_202105 ( check ( join_date >= DATE '2021-05-01' AND join_date < DATE '2021-06-01' ) ) INHERITS (tbl_partition); |
查看分区:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | postgres=# \d List of relations Schema | Name | Type | Owner --------+----------------------+-------+---------- public | tbl_partition | table | postgres public | tbl_partition_202011 | table | postgres public | tbl_partition_202012 | table | postgres public | tbl_partition_202101 | table | postgres public | tbl_partition_202102 | table | postgres public | tbl_partition_202103 | table | postgres public | tbl_partition_202104 | table | postgres public | tbl_partition_202105 | table | postgres (8 rows) postgres=# \d tbl_partition Table "public.tbl_partition" Column | Type | Collation | Nullable | Default -----------+-----------------------+-----------+----------+--------- id | integer | | | name | character varying(20) | | | gender | boolean | | | join_date | date | | | dept | character(4) | | | Number of child tables: 7 (Use \d+ to list them.) postgres=# \d+ tbl_partition Table "public.tbl_partition" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description -----------+-----------------------+-----------+----------+---------+----------+--------------+------------- id | integer | | | | plain | | name | character varying(20) | | | | extended | | gender | boolean | | | | plain | | join_date | date | | | | plain | | dept | character(4) | | | | extended | | Child tables: tbl_partition_202011, tbl_partition_202012, tbl_partition_202101, tbl_partition_202102, tbl_partition_202103, tbl_partition_202104, tbl_partition_202105 Access method: heap postgres=# \d tbl_partition_202011 Table "public.tbl_partition_202011" Column | Type | Collation | Nullable | Default -----------+-----------------------+-----------+----------+--------- id | integer | | | name | character varying(20) | | | gender | boolean | | | join_date | date | | | dept | character(4) | | | Check constraints: "tbl_partition_202011_join_date_check" CHECK (join_date >= '2020-11-01'::date AND join_date < '2020-12-01'::date) Inherits: tbl_partition |
3 分区键上建索引
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | create index tbl_partition_202011_joindate on tbl_partition_202011 (join_date); create index tbl_partition_202012_joindate on tbl_partition_202012 (join_date); create index tbl_partition_202101_joindate on tbl_partition_202101 (join_date); create index tbl_partition_202102_joindate on tbl_partition_202102 (join_date); create index tbl_partition_202103_joindate on tbl_partition_202103 (join_date); create index tbl_partition_202104_joindate on tbl_partition_202104 (join_date); create index tbl_partition_202105_joindate on tbl_partition_202105 (join_date); postgres=# \d tbl_partition_202011 Table "public.tbl_partition_202011" Column | Type | Collation | Nullable | Default -----------+-----------------------+-----------+----------+--------- id | integer | | | name | character varying(20) | | | gender | boolean | | | join_date | date | | | dept | character(4) | | | Indexes: "tbl_partition_202011_joindate" btree (join_date) Check constraints: "tbl_partition_202011_join_date_check" CHECK (join_date >= '2020-11-01'::date AND join_date < '2020-12-01'::date) Inherits: tbl_partition |
4 创建触发器函数
对于开发人员来说,希望数据库是透明的,只管 insert into tbl_partition。对于数据插向哪个分区,则希望由DB决定。这点,ORACLE实现了,但是PG不行,需要前期人工处理下。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | CREATE OR REPLACE FUNCTION tbl_partition_insert_trigger() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.join_date >= DATE '2020-11-01' AND NEW.join_date < DATE '2020-12-01' ) THEN INSERT INTO tbl_partition_202011 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2020-12-01' AND NEW.join_date < DATE '2021-01-01' ) THEN INSERT INTO tbl_partition_202012 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2021-01-01' AND NEW.join_date < DATE '2021-02-01' ) THEN INSERT INTO tbl_partition_202101 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2021-02-01' AND NEW.join_date < DATE '2021-03-01' ) THEN INSERT INTO tbl_partition_202102 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2021-03-01' AND NEW.join_date < DATE '2021-04-01' ) THEN INSERT INTO tbl_partition_202103 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2021-04-01' AND NEW.join_date < DATE '2021-05-01' ) THEN INSERT INTO tbl_partition_202104 VALUES (NEW.*); ELSIF ( NEW.join_date >= DATE '2021-05-01' AND NEW.join_date < DATE '2021-06-01' ) THEN INSERT INTO tbl_partition_202105 VALUES (NEW.*); ELSE RAISE EXCEPTION 'Date out of range. Fix the tbl_partition_insert_trigger() function!'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql; |
说明:如果不想丢失数据,上面的ELSE 条件可以改成 INSERT INTO tbl_partition_error_join_date VALUES (NEW.*); 同时需要创建一张结构和tbl_partition 一样的表tbl_partition_error_join_date,这样,错误的join_date 数据就可以插入到这张表中而不是报错了。
5 创建触发器
1 2 3 4 | CREATE TRIGGER insert_tbl_partition_trigger BEFORE INSERT ON tbl_partition FOR EACH ROW EXECUTE PROCEDURE tbl_partition_insert_trigger(); |
查看分区表
注意:通过这种方式创建的分区表,视图pg_partitioned_table不显示数据。
1 查看所有表
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | \dt List of relations Schema | Name | Type | Owner --------+----------------------+-------+---------- public | tbl_partition | table | postgres public | tbl_partition_202011 | table | postgres public | tbl_partition_202012 | table | postgres public | tbl_partition_202101 | table | postgres public | tbl_partition_202102 | table | postgres public | tbl_partition_202103 | table | postgres public | tbl_partition_202104 | table | postgres public | tbl_partition_202105 | table | postgres (8 rows) |
2 查看主表
1 2 3 4 5 6 7 8 9 10 11 12 13 | \d tbl_partition Table "public.tbl_partition" Column | Type | Modifiers -----------+-----------------------+----------- id | integer | name | character varying(20) | gender | boolean | join_date | date | dept | character(4) | Triggers: insert_tbl_partition_trigger BEFORE INSERT ON tbl_partition FOR EACH ROW EXECUTE PROCEDURE tbl_partition_insert_trigger() Number of child tables: 7 (Use \d+ to list them.) |
3 查看分区表
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | \d tbl_partition_202104 Table "public.tbl_partition_202104" Column | Type | Modifiers -----------+-----------------------+----------- id | integer | name | character varying(20) | gender | boolean | join_date | date | dept | character(4) | Indexes: "tbl_partition_202104_joindate" btree (join_date) Check constraints: "tbl_partition_202104_join_date_check" CHECK (join_date >= '2021-04-01'::date AND join_date < '2021-05-01'::date) Inherits: tbl_partition |
测试
插入数据
1 2 3 4 5 6 7 8 9 10 11 12 13 | insert into tbl_partition values (1, 'David', '1', '2021-01-10', 'TS'); insert into tbl_partition values (2, 'Sandy', '0', '2021-02-10', 'TS'); insert into tbl_partition values (3, 'Eagle', '1', '2020-11-01', 'TS'); insert into tbl_partition values (4, 'Miles', '1', '2020-12-15', 'SD'); insert into tbl_partition values (5, 'Simon', '1', '2020-12-10', 'SD'); insert into tbl_partition values (6, 'Rock', '1', '2020-11-10', 'SD'); insert into tbl_partition values (7, 'Peter', '1', '2021-01-11', 'SD'); insert into tbl_partition values (8, 'Sally', '0', '2021-03-10', 'BCSC'); insert into tbl_partition values (9, 'Carrie', '0', '2021-04-02', 'BCSC'); insert into tbl_partition values (10, 'Lee', '1', '2021-01-05', 'BMC'); insert into tbl_partition values (11, 'Nicole', '0', '2020-11-10', 'PROJ'); insert into tbl_partition values (12, 'Renee', '0', '2021-01-10', 'TS'); |
查看主表数据
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | select * from tbl_partition; id | name | gender | join_date | dept ----+--------+--------+------------+------ 3 | Eagle | t | 2020-11-01 | TS 6 | Rock | t | 2020-11-10 | SD 11 | Nicole | f | 2020-11-10 | PROJ 4 | Miles | t | 2020-12-15 | SD 5 | Simon | t | 2020-12-10 | SD 1 | David | t | 2021-01-10 | TS 7 | Peter | t | 2021-01-11 | SD 10 | Lee | t | 2021-01-05 | BMC 12 | Renee | f | 2021-01-10 | TS 2 | Sandy | f | 2021-02-10 | TS 8 | Sally | f | 2021-03-10 | BCSC 9 | Carrie | f | 2021-04-02 | BCSC (12 rows) |
查看分区表数据
1 2 3 4 5 6 7 8 9 | select * from tbl_partition_202101 ; id | name | gender | join_date | dept ----+-------+--------+------------+------ 1 | David | t | 2021-01-10 | TS 7 | Peter | t | 2021-01-11 | SD 10 | Lee | t | 2021-01-05 | BMC 12 | Renee | f | 2021-01-10 | TS (4 rows) |
PG 10 新特性
PG10之前实现分区表功能,基本是根据“继承表+约束+规则或触发器”实现,相对于之前的分区实现方式,PG10的分区特性有以下优势:
1)管理分区方便
2)数据插入效率高
注意:
- 主表和分区分别单独创建
- 范围分区支持多个字段组成的KEY
- 列表分区的KEY只能有一个字段
创建主表语法:
1 2 3 | CREATE TABLE 表名 ( [{ 列名称 数据_类型} [, … ] ] ) PARTITION BY RANGE ( [{ 列名称 } [, …] ] ); 范围分区的KEY值可由多个字段组成(最多32个字段)。 |
创建分区语法:
1 2 3 | CREATE TABLE 表名 PARTITION OF 主表 FOR VALUES FROM{ ( 表达式 [, …] ) | MINVALUE } [, …] TO { ( 表达式 [, …] ) | MAXVALUE } [, …] [ TABLESPACE 表空间名 ]; |
参数说明:
// FROM … TO 表示分区的起始值和结束值。
// MINVALUE / MAXVALUE 表示无限小值和无限大值。
// 默认FROM后面的值是包括值分区的约束内,TO后面的值不包括。
范围分区
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | create table tbp(n int, t text) partition by range(n); create table tbp_1 partition of tbp for values from (MINVALUE) to (10); create table tbp_2 partition of tbp for values from (10) to (100); create table tbp_3 partition of tbp for values from (100) to (1000); create table tbp_4 partition of tbp for values from (1000) to (MAXVALUE); lhrdb=# \d List of relations Schema | Name | Type | Owner --------+-------+-------------------+---------- public | tbp | partitioned table | postgres public | tbp_1 | table | postgres public | tbp_2 | table | postgres public | tbp_3 | table | postgres public | tbp_4 | table | postgres (5 rows) lhrdb=# select * from pg_partitioned_table; partrelid | partstrat | partnatts | partdefid | partattrs | partclass | partcollation | partexprs -----------+-----------+-----------+-----------+-----------+-----------+---------------+----------- 41034 | r | 1 | 0 | 1 | 1978 | 0 | (1 row) lhrdb=# \d+ tbp Partitioned table "public.tbp" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+----------+--------------+------------- n | integer | | | | plain | | t | text | | | | extended | | Partition key: RANGE (n) Partitions: tbp_1 FOR VALUES FROM (MINVALUE) TO (10), tbp_2 FOR VALUES FROM (10) TO (100), tbp_3 FOR VALUES FROM (100) TO (1000), tbp_4 FOR VALUES FROM (1000) TO (MAXVALUE) -- 获取分区类型和KEY SELECT pg_get_partkeydef('tbp'::regclass); -- 获取分区范围 SELECT pg_get_partition_constraintdef('tbp_1'::regclass) ; lhrdb=# SELECT pg_get_partkeydef('tbp'::regclass); pg_get_partkeydef ------------------- RANGE (n) (1 row) lhrdb=# SELECT pg_get_partition_constraintdef('tbp_1'::regclass) ; pg_get_partition_constraintdef -------------------------------- ((n IS NOT NULL) AND (n < 10)) (1 row) |
多个key:
1 2 3 4 5 6 7 8 9 10 11 12 | create table test(n1 int, n2 int) partition by range(n1, n2); create table test_1 partition of test for values from (0, 0) to (10, 100); lhrdb=# \d+ test_1 Table "public.test_1" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- n1 | integer | | | | plain | | n2 | integer | | | | plain | | Partition of: test FOR VALUES FROM (0, 0) TO (10, 100) Partition constraint: ((n1 IS NOT NULL) AND (n2 IS NOT NULL) AND ((n1 > 0) OR ((n1 = 0) AND (n2 >= 0))) AND ((n1 < 10) OR ((n1 = 10) AND (n2 < 100)))) Access method: heap |
n1范围从0到10,n2范围从0到100。
示例
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | create table test(n int) partition by range(n); create table test_1 partition of test for values from (MINVALUE) to (10); create table test_2 partition of test for values from (10) to (100); create table test_3 partition of test for values from (100) to (1000); create table test_4 partition of test for values from (1000) to (10000); lhrdb3=# \d+ test Partitioned table "public.test" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- n | integer | | | | plain | | Partition key: RANGE (n) Partitions: test_1 FOR VALUES FROM (MINVALUE) TO (10), test_2 FOR VALUES FROM (10) TO (100), test_3 FOR VALUES FROM (100) TO (1000), test_4 FOR VALUES FROM (1000) TO (10000) lhrdb3=# lhrdb3=# \d+ test_2 Table "public.test_2" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- n | integer | | | | plain | | Partition of: test FOR VALUES FROM (10) TO (100) Partition constraint: ((n IS NOT NULL) AND (n >= 10) AND (n < 100)) Access method: heap lhrdb3=# insert into test select generate_series(0, 9999); INSERT 0 10000 lhrdb3=# explain analyze select * from test; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Append (cost=0.00..337.75 rows=17850 width=4) (actual time=0.025..3.566 rows=10000 loops=1) -> Seq Scan on test_1 (cost=0.00..35.50 rows=2550 width=4) (actual time=0.024..0.027 rows=10 loops=1) -> Seq Scan on test_2 (cost=0.00..35.50 rows=2550 width=4) (actual time=0.017..0.028 rows=90 loops=1) -> Seq Scan on test_3 (cost=0.00..35.50 rows=2550 width=4) (actual time=0.029..0.204 rows=900 loops=1) -> Seq Scan on test_4 (cost=0.00..142.00 rows=10200 width=4) (actual time=0.029..2.049 rows=9000 loops=1) Planning Time: 0.175 ms Execution Time: 4.186 ms (7 rows) |
列表分区
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | CREATE TABLE sales (product_id int, saleroom int, province text) PARTITION BY LIST(province); CREATE TABLE sales_east PARTITION OF sales FOR VALUES IN ('山东','江苏','上海'); CREATE TABLE sales_west PARTITION OF sales FOR VALUES IN ('山西','陕西','四川'); CREATE TABLE sales_north PARTITION OF sales FOR VALUES IN ('北京','河北','辽宁'); CREATE TABLE sales_south PARTITION OF sales FOR VALUES IN ('广东','福建'); lhrdb2=# \dt List of relations Schema | Name | Type | Owner --------+-------------+-------------------+---------- public | sales | partitioned table | postgres public | sales_east | table | postgres public | sales_north | table | postgres public | sales_south | table | postgres public | sales_west | table | postgres (5 rows) lhrdb2=# insert into sales values(1,1,'山东'),(2,2,'山西'),(3,3,'北京'),(4,4,'广东'); INSERT 0 4 lhrdb2=# select * from sales; product_id | saleroom | province ------------+----------+---------- 1 | 1 | 山东 3 | 3 | 北京 2 | 2 | 山西 4 | 4 | 广东 (4 rows) lhrdb2=# select * from sales_east; product_id | saleroom | province ------------+----------+---------- 1 | 1 | 山东 (1 row) lhrdb2=# select * from sales_west; product_id | saleroom | province ------------+----------+---------- 2 | 2 | 山西 (1 row) |
绑定分区
分区表支持把普通表绑定成父表的一个分区,也支持把分区解绑为普通表。
若普通表中有数据,ATTACH操作时,默认会做数据校验。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | CREATE TABLE sales_foreign (like sales); ALTER TABLE sales ATTACH PARTITION sales_foreign FOR VALUES IN('美国','日本'); lhrdb2=# \d+ sales Partitioned table "public.sales" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ------------+---------+-----------+----------+---------+----------+--------------+------------- product_id | integer | | | | plain | | saleroom | integer | | | | plain | | province | text | | | | extended | | Partition key: LIST (province) Partitions: sales_east FOR VALUES IN ('山东', '江苏', '上海'), sales_foreign FOR VALUES IN ('美国', '日本'), sales_north FOR VALUES IN ('北京', '河北', '辽宁'), sales_south FOR VALUES IN ('广东', '福建'), sales_west FOR VALUES IN ('山西', '陕西', '四川') lhrdb2=# ALTER TABLE sales DETACH PARTITION sales_foreign; ALTER TABLE lhrdb2=# \d+ sales Partitioned table "public.sales" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ------------+---------+-----------+----------+---------+----------+--------------+------------- product_id | integer | | | | plain | | saleroom | integer | | | | plain | | province | text | | | | extended | | Partition key: LIST (province) Partitions: sales_east FOR VALUES IN ('山东', '江苏', '上海'), sales_north FOR VALUES IN ('北京', '河北', '辽宁'), sales_south FOR VALUES IN ('广东', '福建'), sales_west FOR VALUES IN ('山西', '陕西', '四川') |
查询
获取系统信息(系统表):
pg_partitioned_table 记录主表信息的系统表:
PG 11新特性
PostgreSQL 11 为分区表功能提供更多的改进。这些特性包括:hash 分区、索引增强、DML改进,以及性能优化:faster partition pruning、run-time partition pruning,、partition-wise join。
- pg11新增了hash partition
- pg11中可以建立default partition(默认分区)
- pg11中可以对partition key进行更新
- pg11中在partitioned table(主表)上建立索引,索引会自动建立在partition(子表)上
- pg11中可以在partitioned table上建立unique constraints
- 在pg11中attach分区时,会自动给该新分区建立唯一约束和索引
- pg_partition_tree函数,返回分区表详细信息,例如分区名称、上一级分区名称、是否叶子结点、层级,层级为零表示顶层主表。
- pg_partition_root函数用于当已知partition时查询出分区表顶层主表的名字
- pg_partition_ancestors返回上层分区名称,包括本层分区名称。
hash分区语法
1 2 3 4 5 6 7 8 9 | -- 主表 CREATE TABLE table_name ( column_name data_type ) PARTITION BY HASH ( { column_name } [, ... ] ) -- 子表 CREATE TABLE table_name PARTITION OF parent_table FOR VALUES WITH ( MODULUS numeric_literal, REMAINDER numeric_literal ) |
哈希分区支持多列分区,下面给出示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | create table test_hash(id int, date date) partition by hash(id); create table test_hash_1 partition of test_hash for values with(modulus 2, remainder 0); create table test_hash_2 partition of test_hash for values with(modulus 2, remainder 1); \d+ test_hash Table "public.test_hash" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: HASH (id) Partitions: test_hash_1 FOR VALUES WITH (modulus 2, remainder 0), test_hash_2 FOR VALUES WITH (modulus 2, remainder 1) \d+ test_hash_1 Table "public.test_hash_1" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition of: test_hash FOR VALUES WITH (modulus 2, remainder 0) Partition constraint: satisfies_hash_partition('16603'::oid, 2, 0, id) |
示例2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | create table test_hash_key(x int, y int) partition by hash(x,y); create table test_hash_key_1 partition of test_hash_key for values with(modulus 2, remainder 0); create table test_hash_key_2 partition of test_hash_key for values with(modulus 2, remainder 1); \d+ test_hash_key Table "public.test_hash_key" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- x | integer | | | | plain | | y | integer | | | | plain | | Partition key: HASH (x, y) Partitions: test_hash_key_1 FOR VALUES WITH (modulus 2, remainder 0), test_hash_key_2 FOR VALUES WITH (modulus 2, remainder 1) \d+ test_hash_key_1 Table "public.test_hash_key_1" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- x | integer | | | | plain | | y | integer | | | | plain | | Partition of: test_hash_key FOR VALUES WITH (modulus 2, remainder 0) Partition constraint: satisfies_hash_partition('16561'::oid, 2, 0, x, y) |
默认分区
PostgreSQL 11新特性,防止插入失败,对于不符合分区约束的数据将会插入到默认分区。目前,range/list支持默认分区,hash分区不支持
语法:
1 2 3 4 | CREATE TABLE table_name PARTITION OF parent_table FOR VALUES DEFAULT |
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 | -- range create table test_range_default partition of test_range default; \d+ test_range Table "public.test_range" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: RANGE (date) Partitions: test_range_201801 FOR VALUES FROM ('2018-01-01') TO ('2018-02-01'), test_range_201802 FOR VALUES FROM ('2018-02-01') TO ('2018-03-01'), test_range_default DEFAULT \d+ test_range_default Table "public.test_range_default" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition of: test_range DEFAULT Partition constraint: (NOT ((date IS NOT NULL) AND (((date >= '2018-01-01'::date) AND (date < '2018-02-01'::date)) OR ((date >= '2018-02-01'::date) AND (date < '2018-03-01'::date))))) -- list create table test_list_default partition of test_list default; \d+ test_list Table "public.test_list" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | | | plain | | city | text | | | | extended | | Partition key: LIST (city) Partitions: test_list_hz FOR VALUES IN ('杭州'), test_list_jn FOR VALUES IN ('济南'), test_list_default DEFAULT \d+ test_list_default Table "public.test_list_default" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | | | plain | | city | text | | | | extended | | Partition of: test_list DEFAULT Partition constraint: (NOT ((city IS NOT NULL) AND (city = ANY (ARRAY['杭州'::text, '济南'::text])))) -- hash create table test_hash_default partition of test_hash default; ERROR: a hash-partitioned table may not have a default partition |
默认分区可以防止插入失败:
1 2 3 4 5 6 7 8 9 | insert into test_list values (3, '北京'); select *, tableoid::regclass from test_list; id | city | tableoid ----+------+------------------- 1 | 杭州 | test_list_hz 2 | 济南 | test_list_jn 3 | 北京 | test_list_default (3 rows) |
多级分区
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 | create table test_range_list(id int, city text, date date) partition by list (city); create table test_range_list_jn partition of test_range_list for values in ('济南') partition by range(date); create table test_range_list_hz partition of test_range_list for values in ('杭州') partition by range(date); create table test_range_list_jn_201801 partition of test_range_list_jn for values from ('2018-01-01') to ('2018-02-01'); create table test_range_list_jn_201802 partition of test_range_list_jn for values from ('2018-02-01') to ('2018-03-01'); create table test_range_list_hz_201801 partition of test_range_list_hz for values from ('2018-01-01') to ('2018-02-01'); create table test_range_list_hz_201802 partition of test_range_list_hz for values from ('2018-02-01') to ('2018-03-01'); \d+ test_range_list Table "public.test_range_list" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | | | plain | | city | text | | | | extended | | date | date | | | | plain | | Partition key: LIST (city) Partitions: test_range_list_hz FOR VALUES IN ('杭州'), PARTITIONED, test_range_list_jn FOR VALUES IN ('济南'), PARTITIONED \d+ test_range_list_jn Table "public.test_range_list_jn" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | | | plain | | city | text | | | | extended | | date | date | | | | plain | | Partition of: test_range_list FOR VALUES IN ('济南') Partition constraint: ((city IS NOT NULL) AND (city = '济南'::text)) Partition key: RANGE (date) Partitions: test_range_list_jn_201801 FOR VALUES FROM ('2018-01-01') TO ('2018-02-01'), test_range_list_jn_201802 FOR VALUES FROM ('2018-02-01') TO ('2018-03-01') \d+ test_range_list_jn_201801 Table "public.test_range_list_jn_201801" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | | | plain | | city | text | | | | extended | | date | date | | | | plain | | Partition of: test_range_list_jn FOR VALUES FROM ('2018-01-01') TO ('2018-02-01') Partition constraint: ((city IS NOT NULL) AND (city = '济南'::text) AND (date IS NOT NULL) AND (date >= '2018-01-01'::date) AND (date < '2018-02-01'::date)) |
ATTACH/DETACH 分区
语法:
1 2 3 4 5 | ALTER TABLE name ATTACH PARTITION partition_name { FOR VALUES partition_bound_spec | DEFAULT } ALTER TABLE name DETACH PARTITION partition_name |
以hash分区示例进行演示
detach 示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | \d+ test_hash Table "public.test_hash" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: HASH (id) Partitions: test_hash_1 FOR VALUES WITH (modulus 2, remainder 0), test_hash_2 FOR VALUES WITH (modulus 2, remainder 1) alter table test_hash detach partition test_hash_2; \d+ test_hash Table "public.test_hash" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: HASH (id) Partitions: test_hash_1 FOR VALUES WITH (modulus 2, remainder 0) |
attach 示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | \d+ test_hash Table "public.test_hash" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: HASH (id) Partitions: test_hash_1 FOR VALUES WITH (modulus 2, remainder 0) create table test_hash_attach (id int, date date); alter table test_hash attach partition test_hash_attach for values with (modulus 2, remainder 1); \d+ test_hash Table "public.test_hash" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: HASH (id) Partitions: test_hash_1 FOR VALUES WITH (modulus 2, remainder 0), test_hash_attach FOR VALUES WITH (modulus 2, remainder 1) |
外部表做为分区表
简单介绍如何添加外部表作为分区表,另外还有10版本存在的几个疑问是否得到解决,详见示例。
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 | \c postgres_fdw --创建外部表,加入测试数据 create table test_fdw(id int, date date); insert into test_fdw values (1, '2016-01-01'); insert into test_fdw values (2, '2016-12-01'); select * from test_fdw ; id | date ----+------------ 1 | 2016-01-01 2 | 2016-12-01 (2 rows) \c postgres -- 创建 postgres_fdw extension create extension postgres_fdw ; \dx List of installed extensions Name | Version | Schema | Description --------------+---------+------------+---------------------------------------------------- plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language postgres_fdw | 1.0 | public | foreign-data wrapper for remote PostgreSQL servers (2 rows) -- 创建外部服务器 create server foreign_server foreign data wrapper postgres_fdw options (dbname 'postgres_fdw'); --创建外部服务器用户标识 create user MAPPING FOR postgres server foreign_server options (user 'postgres', password ''); --创建外部表 create foreign table test_fdw(id int, date date) server foreign_server options (schema_name 'public', table_name 'test_fdw'); select * from test_fdw ; id | date ----+------------ 1 | 2016-01-01 2 | 2016-12-01 (2 rows) postgres=# \d+ test_range; Table "public.test_range" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: RANGE (date) Partitions: test_range_201801 FOR VALUES FROM ('2018-01-01') TO ('2018-02-01'), test_range_201802 FOR VALUES FROM ('2018-02-01') TO ('2018-03-01') --添加外部表为分区表 alter table test_range attach partition test_fdw for values from ('2000-01-01') to ('2018-01-01'); \d+ test_range; Table "public.test_range" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | | | plain | | date | date | | | | plain | | Partition key: RANGE (date) Partitions: test_fdw FOR VALUES FROM ('2000-01-01') TO ('2018-01-01'), test_range_201801 FOR VALUES FROM ('2018-01-01') TO ('2018-02-01'), test_range_201802 FOR VALUES FROM ('2018-02-01') TO ('2018-03-01') select *, tableoid::regclass from test_range; id | date | tableoid ----+------------+---------- 1 | 2016-01-01 | test_fdw 2 | 2016-12-01 | test_fdw (2 rows) |
索引增强
主要以下改进:
- 分区主表可以有索引
- 分区主表可以创建唯一约束,需要包含分区键(示例 2)
- 分区主表可以创建外键,但是不能参照分区表创建外键(示例 3)
- 分区主表可以创建 FOR EACH ROW 触发器
前两条换句话说,可以创建主键
示例 1:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | -- 10 create table pk (id int primary key) partition by range(id); ERROR: primary key constraints are not supported on partitioned tables LINE 1: create table pk (id int primary key) partition by range(id); -- 11 create table pk (id int primary key) partition by range(id); create table pk1 partition of pk for values from (1) to (10); \d+ pk Table "public.pk" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | not null | | plain | | Partition key: RANGE (id) Indexes: "pk_pkey" PRIMARY KEY, btree (id) Partitions: pk1 FOR VALUES FROM (1) TO (10) \d+ pk1 Table "public.pk1" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- id | integer | | not null | | plain | | Partition of: pk FOR VALUES FROM (1) TO (10) Partition constraint: ((id IS NOT NULL) AND (id >= 1) AND (id < 10)) Indexes: "pk1_pkey" PRIMARY KEY, btree (id) |
在以前的版本,主表不能创建索引,需要单独在子表创建,可以发现,主表创建的索引,自动在子表创建了。
示例 2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 | -- 唯一约束不包含分区键报错 create table test_unique(x int, y int unique) partition by range(x); ERROR: insufficient columns in UNIQUE constraint definition DETAIL: UNIQUE constraint on table "test_unique" lacks column "x" which is part of the partition key. create table test_unique(x int primary key, y int) partition by range(x); \d+ test_unique Table "public.test_unique" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- x | integer | | not null | | plain | | y | integer | | | | plain | | Partition key: RANGE (x) Indexes: "test_unique_pkey" PRIMARY KEY, btree (x) Number of partitions: 0 -- 非分区键创建hash索引 create index idx_hash on test_unique using hash(y); \d+ test_unique Table "public.test_unique" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- x | integer | | not null | | plain | | y | integer | | | | plain | | Partition key: RANGE (x) Indexes: "test_unique_pkey" PRIMARY KEY, btree (x) "idx_hash" hash (y) Number of partitions: 0 |
示例 3:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | -- 参照分区表创建外键失败 create table test_fk(a int references test_unique(x), b int) partition by range(a); ERROR: cannot reference partitioned table "test_unique" -- 创建普通表 create table test_ref(x int primary key, y int); \d+ test_ref Table "public.test_ref" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- x | integer | | not null | | plain | | y | integer | | | | plain | | Indexes: "test_ref_pkey" PRIMARY KEY, btree (x) -- 分区表带外键 create table test_fk(a int references test_ref(x), b int) partition by range(a); \d+ test_fk Table "public.test_fk" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- a | integer | | | | plain | | b | integer | | | | plain | | Partition key: RANGE (a) Foreign-key constraints: "test_fk_a_fkey" FOREIGN KEY (a) REFERENCES test_ref(x) Number of partitions: 0 \d+ test_ref Table "public.test_ref" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- x | integer | | not null | | plain | | y | integer | | | | plain | | Indexes: "test_ref_pkey" PRIMARY KEY, btree (x) Referenced by: TABLE "test_fk" CONSTRAINT "test_fk_a_fkey" FOREIGN KEY (a) REFERENCES test_ref(x) |
示例 4:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | -- 10 create trigger test_trigger after insert on test_unique for each row execute procedure test_func(); ERROR: "test_unique" is a partitioned table DETAIL: Partitioned tables cannot have ROW triggers. -- 11 -- 创建函数 create function test_func() returns trigger as $$ begin insert into test_trigger values (111); return NEW; end; $$ language plpgsql; -- 创建触发器 create trigger test_trigger after insert on test_unique for each row execute procedure test_func(); -- 插入数据测试 insert into test_unique values (2, 2); select * from test_unique; x | y ---+--- 1 | 2 2 | 2 (2 rows) select * from test_trigger ; id ----- 111 (1 row) |
DML改进
有以下三处改进,给出简单示例。
UPDATE可以在分区之间移动行
如果更新的行满足之前的分区约束,这依旧在原分区,反之,则会找到应该包含该元组的分区,插入到新分区,删除旧分区记录。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | select *, tableoid::regclass from test_list; id | city | tableoid ----+------+------------------- 1 | 杭州 | test_list_hz 2 | 济南 | test_list_jn 3 | 北京 | test_list_default (3 rows) update test_list set city = '济南' where id = 3; select *, tableoid::regclass from test_list; id | city | tableoid ----+------+-------------- 1 | 杭州 | test_list_hz 2 | 济南 | test_list_jn 3 | 济南 | test_list_jn (3 rows) |
INSERT/COPY可以路由到外部分区
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | --插入数据 -- 10 insert into test_range values (3, '2000-01-01'); ERROR: cannot route inserted tuples to a foreign table -- 11 insert into test_range values (3, '2000-01-01'); select *, tableoid::regclass from test_range; id | date | tableoid ----+------------+---------- 1 | 2018-01-01 | test_range_201801 2 | 2018-02-01 | test_range_201802 3 | 2000-01-01 | test_fdw (3 rows) |
注意:这里可以与上面的UPDATE功能一起使用,可以将元组移动到外部分区,不能从外部分区移动。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | -- 与UPDATE一起使用 select *, tableoid::regclass from test_range; id | date | tableoid ----+------------+------------------- 1 | 2018-01-01 | test_range_201801 2 | 2018-02-01 | test_range_201802 3 | 2000-01-01 | test_fdw (3 rows) update test_range set date = '2000-01-01' where id = 2; select *, tableoid::regclass from test_range; id | date | tableoid ----+------------+------------------- 1 | 2018-01-01 | test_range_201801 2 | 2000-01-01 | test_fdw 3 | 2000-01-01 | test_fdw (3 rows) update test_range set date = '2018-01-01' where id = 3; select *, tableoid::regclass from test_range; id | date | tableoid ----+------------+------------------- 1 | 2018-01-01 | test_range_201801 2 | 2000-01-01 | test_fdw 3 | 2018-01-01 | test_fdw (3 rows) |
INSERT .. ON CONFLICT DO UPDATE/NOTHING
分区表也支持insert on conflict 了。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | insert into test_unique values (1, 1); select *, tableoid::regclass from test_unique; x | y | tableoid ---+---+--------------- 1 | 1 | test_unique_1 (1 row) insert into test_unique values (1, 1) on conflict do nothing; select *, tableoid::regclass from test_unique; x | y | tableoid ---+---+--------------- 1 | 1 | test_unique_1 (1 row) insert into test_unique values (1, 1) on conflict(x) do update set y = 2; select *, tableoid::regclass from test_unique; x | y | tableoid ---+---+--------------- 1 | 2 | test_unique_1 (1 row) |
管理分区
移除数据/分区
实现分区表之后,我们就可以很容易地移除不再使用的旧数据了,最简单的方法就是:
1 | drop table tbl_partition_202104; |
这样可以快速移除大量数据,而不是逐条删除数据。
另一个推荐做法是将分区从分区表中移除,但是保留访问权限。
1 | alter table tbl_partition_202104 no inherit tbl_partition; |
和直接DROP 相比,该方式仅仅是使子表脱离了原有的主表,而存储在子表中的数据仍然可以得到访问,因为此时该表已经被还原成一个普通的数据表了。这样对于数据库的DBA来说,就可以在此时对该表进行必要的维护操作,如数据清理、归档等,在完成诸多例行性的操作之后,就可以考虑是直接删除该表(DROP TABLE),还是先清空该表的数据(TRUNCATE TABLE),之后再让该表重新继承主表。
1 2 | alter table tbl_partition_202104 inherit tbl_partition; |
增加分区
我们可以像之前那样增加一个分区
1 2 3 4 5 | create table tbl_partition_202106 ( check ( join_date >= DATE '2021-06-01' AND join_date < DATE '2021-07-01' ) ) INHERITS (tbl_partition); create index tbl_partition_202106_joindate on tbl_partition_202106 (join_date); |
同时,需要修改触发器函数,将插入条件改成相应的值。
说明:创建触发器函数时,最好把插入条件写更未来一点,比如多写十年,这样以后增加新分区时就不需要重新创建触发器函数了,也可以避免一些不必要的错误。
绑定分区
另外,还可以如下增加新的分区:
1 2 3 4 5 6 7 8 9 10 11 12 | create table tbl_partition_202107 (LIKE tbl_partition INCLUDING DEFAULTS INCLUDING CONSTRAINTS); alter table tbl_partition_202107 add constraint tbl_partition_202107_join_date_check check ( join_date >= DATE '2021-07-01' AND join_date < DATE '2021-08-01' ); create index tbl_partition_202107_joindate on tbl_partition_202107 (join_date); copy tbl_partition_202107 from '/tmp/tbl_partition_202107.sql'; -- 从文件中拷贝数据,这些数据可以是事前准备的 alter table tbl_partition_202107 inherit tbl_partition; |
约束排除
约束排除(Constraint exclusion)是一种查询优化技巧,应用在分区特性中,可以提高分区表检索性能,这也是使用分区特性关键的一点。
注:constraint_exclusion的默认值是partition,是介于off和on之间的一个中间值,用来只对继承表和分区表做检查约束,on是对所有的表做检查约束,off是不做检查约束。
确保postgresql.conf 里的配置参数constraint_exclusion 是打开的。没有这个参数,查询不会按照需要进行优化。这里我们需要做的是确保该选项在配置文件中没有被注释掉。
1 2 3 4 5 | lhrdb2=# show constraint_exclusion; constraint_exclusion ---------------------- partition (1 row) |
如果没有约束排除,查询会扫描tbl_partition 表中的每一个分区。打开了约束排除之后,规划器将检查每个分区的约束然后再试图证明该分区不需要被扫描,因为它不能包含任何符合WHERE子句条件的数据行。如果规划器可以证明这个,它就把该分区从查询规划里排除出去。
可以使用EXPLAIN 命令显示一个规划在constraint_exclusion 关闭和打开情况下的不同:
约束排除关闭
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | set constraint_exclusion = off; explain select count(*) from tbl_partition where join_date >= DATE '2021-04-01'; QUERY PLAN ------------------------------------------------------------------------------------------------- Aggregate (cost=172.80..172.81 rows=1 width=0) -> Append (cost=0.00..167.62 rows=2071 width=0) -> Seq Scan on tbl_partition (cost=0.00..0.00 rows=1 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202011 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202012 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202101 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202102 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202103 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202105 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202104 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202106 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202107 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) (22 rows) |
从上面的查询计划中可以看出,PostgreSQL 扫描了所有分区。下面我们再看一下打开约束排除之后的查询计划:
约束排除开启
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | set constraint_exclusion = on; explain select count(*) from tbl_partition where join_date >= DATE '2021-04-01'; QUERY PLAN ------------------------------------------------------------------------------------------------- Aggregate (cost=76.80..76.81 rows=1 width=0) -> Append (cost=0.00..74.50 rows=921 width=0) -> Seq Scan on tbl_partition (cost=0.00..0.00 rows=1 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202105 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202104 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202106 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) -> Seq Scan on tbl_partition_202107 tbl_partition (cost=0.00..18.62 rows=230 width=0) Filter: (join_date >= '2021-04-01'::date) (12 rows) |
可以看到,PostgreSQL 只扫描四月份以后的分区表。
VACUUM 或 ANALYZE tbl_partition 只会对主表起作用,要想分析表,需要分别分析每个分区表。