大家好, 继上一篇为大家介绍了如何查看pg的sql历史和实时计划之后,本篇继续介绍一下如何固定sql的执行计划。
对于维护生产数据库的小伙伴来说,80%的数据库故障源于运行项目中的sql本身,如果sql能够拥有稳定合理的执行计划,那么生产数据库就能保持稳定运行。
熟悉oracle的小伙伴都知道,无论从五环八门的hint 种类来说, 还是oracle从最早的9i 中的 store outline, 到10g的sql profile 再到11g的 spm (sql plan management)
可见业内rdbms旗舰产品一直在进化自己固定执行计划的功能。
pg 作为功能最强大的开源数据库,虽然原生pg并不支持,但是背后依托强大的插件机制和pg系数据库定制,你依然可以找到类似于替换的方案:
oracle hint -> pg extension pg_hint_plan
oracle outline -> pg extension pg_plan_guarantee
oracle spm -> aws aurora postgresql 的 query plan management (qpm)
我们先来看一下pg官方是否打算开发hint 功能, pg的学院派官方并不打算在原生的代码中加入hint 的功能: 理由是在其他数据库中(oracle和mysql) 中已经实现了, 我们对此并不感兴趣了。
很real 的答案。
虽然官方不支持,幸好还有强大第三方插件 pg_hint_plan :
github 的米乐appm6米乐安卓版下载官网下载首页地址: https://github.com/ossc-db/pg_hint_plan 来自于一家日本的公司(ntt)
下载的时候需要注意版本的问题: 我们测试的版本是 pg15 , 对应的兼容版本是 pg_hint_plan15 1.5。
每个pg版本都需要下载独立的pg_hint_plan: 比如 pg13 需要 pg_hint_plan13 1.3.8, pg14 需要 pg_hint_plan14 1.4.1
下载版本: pg_hint_plan15 1.5
infra [postgres@wqdcsrv3352 postgresql]# wget https://github.com/ossc-db/pg_hint_plan/archive/refs/tags/rel15_1_5_0.tar.gz
下载后的压缩包很小,只有196kb
infra [postgres@wqdcsrv3352 postgresql]# ls -lhtr | grep rel15_1_5_0.tar.gz
-rw------- 1 postgres postgres 196k jan 30 17:09 rel15_1_5_0.tar.gz
我们来解压安装一下:
infra [postgres@wqdcsrv3352 postgresql]# tar -xvf rel15_1_5_0.tar.gz
infra [postgres@wqdcsrv3352 postgresql]# cd pg_hint_plan-rel15_1_5_0
infra [postgres@wqdcsrv3352 pg_hint_plan-rel15_1_5_0]# make
gcc -std=gnu99 -wall -wmissing-prototypes -wpointer-arith -wdeclaration-after-statement -werror=vla -wendif-labels -wmissing-format-attribute -wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -o2 -fpic -i. -i./ -i/opt/postgresql/pg15/include/postgresql/server -i/opt/postgresql/pg15/include/postgresql/internal -d_gnu_source -c -o pg_hint_plan.o pg_hint_plan.c
gcc -std=gnu99 -wall -wmissing-prototypes -wpointer-arith -wdeclaration-after-statement -werror=vla -wendif-labels -wmissing-format-attribute -wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -o2 -fpic pg_hint_plan.o -l/opt/postgresql/pg15/lib -wl,--as-needed -wl,-rpath,'/opt/postgresql/pg15/lib',--enable-new-dtags -shared -o pg_hint_plan.so
infra [postgres@wqdcsrv3352 pg_hint_plan-rel15_1_5_0]# make install
/bin/mkdir -p '/opt/postgresql/pg15/share/postgresql/extension'
/bin/mkdir -p '/opt/postgresql/pg15/share/postgresql/extension'
/bin/mkdir -p '/opt/postgresql/pg15/lib/postgresql'
/bin/install -c -m 644 .//pg_hint_plan.control '/opt/postgresql/pg15/share/postgresql/extension/'
/bin/install -c -m 644 .//pg_hint_plan--1.3.0.sql .//pg_hint_plan--1.3.0--1.3.1.sql .//pg_hint_plan--1.3.1--1.3.2.sql .//pg_hint_plan--1.3.2--1.3.3.sql .//pg_hint_plan--1.3.3--1.3.4.sql .//pg_hint_plan--1.3.5--1.3.6.sql .//pg_hint_plan--1.3.4--1.3.5.sql .//pg_hint_plan--1.3.6--1.3.7.sql .//pg_hint_plan--1.3.7--1.3.8.sql .//pg_hint_plan--1.3.8--1.4.sql .//pg_hint_plan--1.4--1.4.1.sql .//pg_hint_plan--1.4.1--1.5.sql '/opt/postgresql/pg15/share/postgresql/extension/'
/bin/install -c -m 755 pg_hint_plan.so '/opt/postgresql/pg15/lib/postgresql/'
下面我们执行一个sql: 我们可以看到执行计划中 表之间的连接方式是nested loop ,表的访问方式是 index scan
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
pgbench@[local:/tmp]:1992=#8511 load 'auto_explain';
load
pgbench@[local:/tmp]:1992=#8511 set auto_explain.log_min_duration = 0;
set
pgbench@[local:/tmp]:1992=#8511 set client_min_messages = log;
set
pgbench@[local:/tmp]:1992=#8511 select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.aid;
log: duration: 1428.874 ms plan:
query text: select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
nested loop (cost=0.57..118918.94 rows=1600826 width=461)
-> index scan using pgbench_accounts_pkey on pgbench_accounts a (cost=0.43..80219.93 rows=1600826 width=97)
-> memoize (cost=0.15..0.16 rows=1 width=364)
cache key: a.bid
cache mode: logical
-> index scan using pgbench_branches_pkey on pgbench_branches b (cost=0.14..0.15 rows=1 width=364)
index cond: (bid = a.bid)
下面我们使用 hint 修改表连接方式为hashjoin, 表的访问方式为 seq scan
这里注意与oracle的hint 所在位置不一样,pg的hint的位置是 sql语句的最前面,一个注释的代码块里面:
/ hashjoin(a b) seqscan(a) seqscan(b)/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
pgbench@[local:/tmp]:1992=#14342 /* hashjoin(a b) seqscan(a) seqscan(b)*/
pgbench-# select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.aid;
log: temporary file: path "base/pgsql_tmp/pgsql_tmp15324.0", size 58384384
log: temporary file: path "base/pgsql_tmp/pgsql_tmp15323.0", size 59637760
log: duration: 1517.595 ms plan:
query text: /* hashjoin(a b) seqscan(a) seqscan(b)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
gather merge (cost=403460.65..559107.41 rows=1334022 width=461)
workers planned: 2
-> sort (cost=402460.62..404128.15 rows=667011 width=461)
sort key: a.aid
-> hash join (cost=31.36..43115.87 rows=667011 width=461)
hash cond: (a.bid = b.bid)
-> parallel seq scan on pgbench_accounts a (cost=0.00..33913.11 rows=667011 width=97)
-> hash (cost=31.16..31.16 rows=16 width=364)
-> seq scan on pgbench_branches b (cost=0.00..31.16 rows=16 width=364)
log: temporary file: path "base/pgsql_tmp/pgsql_tmp14342.0", size 66461696
log: duration: 1523.931 ms statement: /* hashjoin(a b) seqscan(a) seqscan(b)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
pg_hint_plan 这个插件,不需要 执行create extension 这个命令。 (create extension 的情况下有另外一种使用场景,最后我们会介绍一下)
同样支持 session 生效的方式 load 和 global 生效的方式 修改参数 shared_preload_libraries(重启生效)
如何知道 pg_hint_plan 支持多少种类型的hint 呢?
github 上下载的源码压缩包中有具体说明的网页 : pg_hint_plan-rel15_1_5_0/doc/hint_list.html
总结起来有如下几大类:
扫描方式相关的 (scan method)
表连接方式相关的(join method)
表连接顺序相关的( join order)
表的基数相关的,类似于oracle中的 cardinality(row number correction)
并行查询参数相关的( parallel query configuration)
参数设置相关 set(guc-param value)
内存化连接的inner table (behavior control on join)
我们来分别体验一下:
扫描方式相关的 (scan method): 测试 seqscan, noindexonlyscan
原始sql:
pgbench@[local:/tmp]:1992=#97268 select * from pgbench_branches where bid = 1;
log: duration: 0.036 ms plan:
query text: select * from pgbench_branches where bid = 1;
index scan using pgbench_branches_pkey on pgbench_branches (cost=0.14..8.17 rows=1 width=364)
index cond: (bid = 1)
添加hint : seqscan
pgbench@[local:/tmp]:1992=#97268 /* seqscan(a)*/
pgbench-# select * from pgbench_branches a where bid = 1;
log: duration: 0.057 ms plan:
query text: /* seqscan(a)*/
select * from pgbench_branches a where bid = 1;
seq scan on pgbench_branches a (cost=0.00..31.20 rows=1 width=364)
filter: (bid = 1)
原始sql:
pgbench@[local:/tmp]:1992=#97268 select bid from pgbench_branches a where bid > 0;
log: duration: 0.026 ms plan:
query text: select bid from pgbench_branches a where bid > 0;
index only scan using pgbench_branches_pkey on pgbench_branches a (cost=0.14..8.41 rows=16 width=4)
index cond: (bid > 0)
添加hint : noindexonlyscan
pgbench@[local:/tmp]:1992=#97268 /* noindexonlyscan(a)*/
pgbench-# select bid from pgbench_branches a where bid > 0;
log: duration: 0.092 ms plan:
query text: /* noindexonlyscan(a)*/
select bid from pgbench_branches a where bid > 0;
seq scan on pgbench_branches a (cost=0.00..31.20 rows=16 width=4)
filter: (bid > 0)
表连接方式相关的(join method): 这里涵盖了pg主要的几种连接方式 nestloop, hashjoin, mergejoin
添加 hint (nestloop) : pgbench_branches 作为inner 的内部表 cache到内存中(只有16条记录的小表),与outer table pgbench_accounts (1600000条记录)进行嵌套连接
pgbench@[local:/tmp]:1992=#97268 /* nestloop(a b)*/
pgbench-# select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.aid;
log: duration: 1447.723 ms plan:
query text: /* nestloop(a b)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
nested loop (cost=0.57..118918.94 rows=1600826 width=461)
-> index scan using pgbench_accounts_pkey on pgbench_accounts a (cost=0.43..80219.93 rows=1600826 width=97)
-> memoize (cost=0.15..0.16 rows=1 width=364)
cache key: a.bid
cache mode: logical
-> index scan using pgbench_branches_pkey on pgbench_branches b (cost=0.14..0.15 rows=1 width=364)
index cond: (bid = a.bid)
log: duration: 1448.326 ms statement: /* nestloop(a b)*/
添加 hint hashjoin: 同样是小表的 pgbench_branches 作为hash join 的驱动表
pgbench@[local:/tmp]:1992=#97268 /* hashjoin(a b)*/
pgbench-# select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.aid;
log: duration: 2037.850 ms plan:
query text: /* hashjoin(a b)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
sort (cost=921120.72..925122.78 rows=1600826 width=461)
sort key: a.aid
-> hash join (cost=31.36..48597.87 rows=1600826 width=461)
hash cond: (a.bid = b.bid)
-> seq scan on pgbench_accounts a (cost=0.00..43251.26 rows=1600826 width=97)
-> hash (cost=31.16..31.16 rows=16 width=364)
-> seq scan on pgbench_branches b (cost=0.00..31.16 rows=16 width=364)
log: temporary file: path "base/pgsql_tmp/pgsql_tmp97268.2", size 184451072
log: duration: 2052.505 ms statement: /* hashjoin(a b)*/
添加 hint mergejoin : merge join 一般作用于2个表连接条件的列上均存在索引,或者连接条件的列上最后返回的结果集需要进行排序
pgbench@[local:/tmp]:1992=#97268 /* mergejoin(a b)*/
pgbench-# select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.bid,b.bid;
log: duration: 1583.642 ms plan:
query text: /* mergejoin(a b)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.bid,b.bid;
merge join (cost=290411.31..314423.78 rows=1600826 width=461)
merge cond: (a.bid = b.bid)
-> sort (cost=383310.46..387312.53 rows=1600826 width=97)
sort key: a.bid
-> seq scan on pgbench_accounts a (cost=0.00..43251.26 rows=1600826 width=97)
-> sort (cost=31.48..31.52 rows=16 width=364)
sort key: b.bid
-> seq scan on pgbench_branches b (cost=0.00..31.16 rows=16 width=364)
log: temporary file: path "base/pgsql_tmp/pgsql_tmp97268.3", size 171589632
表连接顺序相关的( join order): leading
这个hint需要注意一下,当表的连接个数查过2个的时候,比如 leading (a b c) 这个的含义 并不等于 a->b->c
正确的写法是 leading (((a b) c)): 需要用小括号来固定一下:
原始sql: 表的连接顺序是 b-> c-> a 优化器选择一张最小的表 b 作为驱动表
query text: select count(1)
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
join pgbench_tellers c on b.bid = c.bid;
aggregate (cost=267421.73..267421.74 rows=1 width=8)
-> hash join (cost=54.83..227401.08 rows=16008260 width=0)
hash cond: (a.bid = b.bid)
-> seq scan on pgbench_accounts a (cost=0.00..43251.26 rows=1600826 width=4)
-> hash (cost=52.83..52.83 rows=160 width=8)
-> nested loop (cost=0.15..52.83 rows=160 width=8)
-> seq scan on pgbench_tellers c (cost=0.00..45.60 rows=160 width=4)
-> memoize (cost=0.15..0.21 rows=1 width=4)
cache key: c.bid
cache mode: logical
-> index only scan using pgbench_branches_pkey on pgbench_branches b (cost=0.14..0.20 rows=1 width=4)
index cond: (bid = c.bid)
我们使用hint修改一下表的连接顺序:leading(((c a) b)): c-> a-> b
query text: /* leading(((c a) b))*/
select count(1)
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
join pgbench_tellers c on b.bid = c.bid;
aggregate (cost=349860.63..349860.64 rows=1 width=8)
-> hash join (cost=69524.16..309839.98 rows=16008260 width=0)
hash cond: (a.bid = b.bid)
-> hash join (cost=69515.59..256678.99 rows=16008260 width=8)
hash cond: (c.bid = a.bid)
-> seq scan on pgbench_tellers c (cost=0.00..45.60 rows=160 width=4)
-> hash (cost=43251.26..43251.26 rows=1600826 width=4)
-> seq scan on pgbench_accounts a (cost=0.00..43251.26 rows=1600826 width=4)
-> hash (cost=8.38..8.38 rows=16 width=4)
-> index only scan using pgbench_branches_pkey on pgbench_branches b (cost=0.14..8.38 rows=16 width=4)
表的基数相关的,类似于oracle中的 cardinality(row number correction):
修改表的cardinality, cardinality 是个十分重要的因素,可以直接影响到执行计划。 之前很多oracle的sql优化都是基于cardinality 的优化
原始sql: 连接方式是 nested loop, 因为表 pgbench_branches b 是个只有16行的小表
query text: select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
nested loop (cost=0.57..118918.94 rows=1600826 width=461)
-> index scan using pgbench_accounts_pkey on pgbench_accounts a (cost=0.43..80219.93 rows=1600826 width=97)
-> memoize (cost=0.15..0.16 rows=1 width=364)
cache key: a.bid
cache mode: logical
-> index scan using pgbench_branches_pkey on pgbench_branches b (cost=0.14..0.15 rows=1 width=364)
index cond: (bid = a.bid)
我们现在 添加hint 把pgbench_accounts a 这个的rows 设置为 1: (之前是1600826 )
我们看到表的连接当时变成了hash join , pgbench_accounts a 也变成了 seq scan 的方式
pgbench@[local:/tmp]:1992=#20579 /* rows(b a #1)*/
pgbench-# select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.aid;
log: duration: 2015.260 ms plan:
query text: /* rows(b a #1)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
sort (cost=48597.93..48597.93 rows=1 width=461)
sort key: a.aid
-> hash join (cost=31.36..48597.87 rows=1 width=461)
hash cond: (a.bid = b.bid)
-> seq scan on pgbench_accounts a (cost=0.00..43251.26 rows=1600826 width=97)
-> hash (cost=31.16..31.16 rows=16 width=364)
-> seq scan on pgbench_branches b (cost=0.00..31.16 rows=16 width=364)
并行查询参数相关的( parallel query configuration): parallel(a 8 soft|hard)
soft 选项是 相当于设置了参数 max_parallel_worker_per_gather, 然后去让优化器去实际的分配 worker的数量
hard 选项是 强制分配worker的数量
parallel(a 8 soft) --》 workers planned: 3
query text: /* hashjoin(a b) parallel(a 8 soft)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
gather merge (cost=279990.89..463311.12 rows=1549185 width=461)
workers planned: 3
-> sort (cost=278990.85..280281.84 rows=516395 width=461)
sort key: a.aid
-> hash join (cost=31.36..1745.96 rows=516395 width=461)
hash cond: (a.bid = b.bid)
-> parallel seq scan on pgbench_accounts a (cost=0.00..0.00 rows=516395 width=97)
-> hash (cost=31.16..31.16 rows=16 width=364)
-> seq scan on pgbench_branches b (cost=0.00..31.16 rows=16 width=364)
parallel(a 8 hard) --》workers planned: 8
query text: /* hashjoin(a b) parallel(a 8 hard)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
gather merge (cost=107764.93..305726.23 rows=1600824 width=461)
workers planned: 8
-> sort (cost=106764.79..107265.04 rows=200103 width=461)
sort key: a.aid
-> hash join (cost=31.36..695.76 rows=200103 width=461)
hash cond: (a.bid = b.bid)
-> parallel seq scan on pgbench_accounts a (cost=0.00..0.00 rows=200103 width=97)
-> hash (cost=31.16..31.16 rows=16 width=364)
-> seq scan on pgbench_branches b (cost=0.00..31.16 rows=16 width=364)
参数设置相关 set(guc-param value):
相当于session 级别修改了参数的默认值
缩小 work_mem和 temp_buffers 为2mb: 我们可以看到产生了临时文件:
log: temporary file: path “base/pgsql_tmp/pgsql_tmp63027.0”, size 55083008
log: temporary file: path “base/pgsql_tmp/pgsql_tmp63028.0”, size 51871744
pgbench@[local:/tmp]:1992=#54795 /* set(work_mem 2mb)
pgbench*# set(temp_buffers 2mb)*/
pgbench-# select *
pgbench-# from
pgbench-# pgbench_accounts a order by a.bid;
log: temporary file: path "base/pgsql_tmp/pgsql_tmp63027.0", size 55083008
log: temporary file: path "base/pgsql_tmp/pgsql_tmp63028.0", size 51871744
log: duration: 1051.854 ms plan:
query text: /* set(work_mem 2mb)
set(temp_buffers 2mb)*/
select *
from
pgbench_accounts a order by a.bid;
gather merge (cost=137889.63..293536.39 rows=1334022 width=97)
workers planned: 2
-> sort (cost=136889.60..138557.13 rows=667011 width=97)
sort key: bid
-> parallel seq scan on pgbench_accounts a (cost=0.00..33913.11 rows=667011 width=97)
hint 设置参数 为 64mb: set(temp_buffers 64mb)set(work_mem 64mb)
我们看到无需临时文件的产生。
pgbench@[local:/tmp]:1992=#54795 /* set(work_mem 64mb)
pgbench*# set(temp_buffers 64mb)*/
pgbench-# select *
pgbench-# from
pgbench-# pgbench_accounts a order by a.bid;
log: duration: 1343.024 ms plan:
query text: /* set(work_mem 64mb)
set(temp_buffers 64mb)*/
select *
from
pgbench_accounts a order by a.bid;
sort (cost=202833.93..206836.00 rows=1600826 width=97)
sort key: bid
-> seq scan on pgbench_accounts a (cost=0.00..43251.26 rows=1600826 width=97)
内存化连接的inner table (behavior control on join):
原始sql: 观察执行计划中
-> memoize (cost=0.15…0.16 rows=1 width=364)
cache key: a.bid
cache mode: logical
缓存了inner table (pgbench_branches) 的结果集, 加快了 nested loop的速度。
pgbench@[local:/tmp]:1992=#54795 select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.aid;
log: duration: 1471.256 ms plan:
query text: select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
nested loop (cost=0.57..118918.94 rows=1600826 width=461)
-> index scan using pgbench_accounts_pkey on pgbench_accounts a (cost=0.43..80219.93 rows=1600826 width=97)
-> memoize (cost=0.15..0.16 rows=1 width=364)
cache key: a.bid
cache mode: logical
-> index scan using pgbench_branches_pkey on pgbench_branches b (cost=0.14..0.15 rows=1 width=364)
index cond: (bid = a.bid)
我们使用hint 去掉 缓存inner table (pgbench_branches) 的结果集: / nomemoize(a b)/
我们看到由于没有缓存小表的结果集, 这个 nested loop 就变成了每一条都去loop 一下, 执行效率会慢很多。
inner table pgbench_branches 仅仅只有16 条的数据情况下,对比还是很明显的:
memoize innner table : duration: 1471.256 ms
nomemoize innner table : duration: 2124.444 ms
pgbench@[local:/tmp]:1992=#54795 /* nomemoize(a b)*/
pgbench-# select *
pgbench-# from pgbench_branches b
pgbench-# join pgbench_accounts a on b.bid = a.bid
pgbench-# order by a.aid;
log: duration: 2124.444 ms plan:
query text: /* nomemoize(a b)*/
select *
from pgbench_branches b
join pgbench_accounts a on b.bid = a.bid
order by a.aid;
nested loop (cost=0.56..323043.40 rows=1600826 width=461)
-> index scan using pgbench_accounts_pkey on pgbench_accounts a (cost=0.43..80219.93 rows=1600826 width=97)
-> index scan using pgbench_branches_pkey on pgbench_branches b (cost=0.14..0.15 rows=1 width=364)
index cond: (bid = a.bid)
我们再来看一下pg_hint_plan 对于 subquery 的支持:
对于子查询的语句块:hint中必须要使用别名 any_subquery
in (select … {limit | offset …} …)
= any (select … {limit | offset …} …)
= some (select … {limit | offset …} …)
原始sql的语句:
pgbench@[local:/tmp]:1992=#130849 select *
pgbench-# from pgbench_accounts a1
pgbench-# where aid in (select bid from pgbench_accounts a2 limit 10);
log: duration: 0.068 ms plan:
query text: select *
from pgbench_accounts a1
where aid in (select bid from pgbench_accounts a2 limit 10);
merge semi join (cost=1.25..2.15 rows=10 width=97)
merge cond: (a1.aid = a2.bid)
-> index scan using pgbench_accounts_pkey on pgbench_accounts a1 (cost=0.43..80219.93 rows=1600826 width=97)
-> sort (cost=0.54..0.56 rows=10 width=4)
sort key: a2.bid
-> limit (cost=0.00..0.27 rows=10 width=4)
-> seq scan on pgbench_accounts a2 (cost=0.00..43251.26 rows=1600826 width=4)
我们在hint 中指定 子查询的语句的别名 any_subquery:
pgbench@[local:/tmp]:1992=#130849 /* hashjoin(a1 any_subquery)*/
pgbench-# select *
pgbench-# from pgbench_accounts a1
pgbench-# where aid in (select bid from pgbench_accounts a2 limit 10);
log: duration: 192.251 ms plan:
query text: /* hashjoin(a1 any_subquery)*/
select *
from pgbench_accounts a1
where aid in (select bid from pgbench_accounts a2 limit 10);
hash semi join (cost=0.50..47454.03 rows=10 width=97)
hash cond: (a1.aid = a2.bid)
-> seq scan on pgbench_accounts a1 (cost=0.00..43251.26 rows=1600826 width=97)
-> hash (cost=0.37..0.37 rows=10 width=4)
-> limit (cost=0.00..0.27 rows=10 width=4)
-> seq scan on pgbench_accounts a2 (cost=0.00..43251.26 rows=1600826 width=4)
okay, 上面我们介绍了pg_hint_plan的基本使用方式。
添加这些hint, 都是需要修改sql语句的, 有没有一种方法可以在不修改sql的情况下, 为正在运行的sql 添加hint 来改变执行计划?
答案是肯定的。
下面我们来看一个例子: 我们需要创建 extension : 目的是创建表 hint_plan.hints
--创建表 hint_plan.hints
pgbench@[local:/tmp]:1992=#54795 create extension pg_hint_plan;
create extension
-- 打开此功能
pgbench@[local:/tmp]:1992=#54795 set pg_hint_plan.enable_hint_table = on;
set
pgbench@[local:/tmp]:1992=#54795 \d hint_plan.hints
table "hint_plan.hints"
column | type | collation | nullable | default
------------------- --------- ----------- ---------- ---------------------------------------------
id | integer | | not null | nextval('hint_plan.hints_id_seq'::regclass)
norm_query_string | text | | not null |
application_name | text | | not null |
hints | text | | not null |
indexes:
"hints_pkey" primary key, btree (id)
"hints_norm_and_app" unique, btree (norm_query_string, application_name)
我们把一条简单的sql语句: 原本是触发索引访问的sql ‘select * from pgbench_branches a where bid = ?;’ 插入到 hint_plan.hints中,使其使用hint 文本: seqscan(a)
query text: insert into hint_plan.hints
(norm_query_string,
application_name,
hints)
values
('select * from pgbench_branches a where bid = ?;',
'',
'seqscan(a)');
insert 0 1
我们再次执行一下这个sql: select * from pgbench_branches a where bid = 100;
我们发现 开启pg_hint_plan.enable_hint_table = on; 这个功能后, sql会首先访问表 hint_plan.hints: 查看是否有绑定的hint
select hints from hint_plan.hints where norm_query_string = $1 and ( application_name = $2 or application_name = ‘’ ) order by application_name desc
如果存在的话,则使用该hint, 返回的执行计划是 seq scan on pgbench_branches a , 符合预期。
pgbench@[local:/tmp]:1992=#54795 select * from pgbench_branches a where bid = 100;
log: duration: 0.036 ms plan:
query text: select hints from hint_plan.hints where norm_query_string = $1 and ( application_name = $2 or application_name = '' ) order by application_name desc
sort (cost=11.35..11.35 rows=1 width=64)
sort key: application_name desc
-> bitmap heap scan on hints (cost=4.17..11.30 rows=1 width=64)
recheck cond: (norm_query_string = $1)
filter: ((application_name = $2) or (application_name = ''::text))
-> bitmap index scan on hints_norm_and_app (cost=0.00..4.17 rows=3 width=0)
index cond: (norm_query_string = $1)
log: duration: 0.033 ms plan:
query text: select * from pgbench_branches a where bid = 100;
seq scan on pgbench_branches a (cost=0.00..31.20 rows=1 width=364)
filter: (bid = 100)
bid | bbalance | filler
----- ---------- --------
(0 rows)
如果不想使用该功能,关闭即可:
pgbench@[local:/tmp]:1992=#54795 set pg_hint_plan.enable_hint_table = off;
set
pgbench@[local:/tmp]:1992=#54795 select * from pgbench_branches a where bid = 100;
log: duration: 0.016 ms plan:
query text: select * from pgbench_branches a where bid = 100;
index scan using pgbench_branches_pkey on pgbench_branches a (cost=0.14..8.17 rows=1 width=364)
最后我们总结一下:
1)pg的第三方插件pg_hint_plan 的支持hint 的类型主要有如下几大类:
扫描方式相关的 (scan method)
表连接方式相关的(join method)
表连接顺序相关的( join order)
表的基数相关的,类似于oracle中的 cardinality(row number correction)
并行查询参数相关的( parallel query configuration)
参数设置相关 set(guc-param value)
内存化连接的inner table (behavior control on join)
2)pg 的pg_hint_plan 插件中 hint 的一些写法 与 oracle是不同的:
a) 需要写在代码块的第一行
/* hint statement */
sql statement
b) hint leading 的写法 需要里面用小括号再次包装一层: 比如 a->b : leading ((a b)) , a->b->c : leading(((a b) c))
c)hint 中指定表的别名之大小写敏感的, 比如sql语句中表的别名 tbl 与 hint 中的别名tbl 是不匹配的
3)我们可以在不修改sql语句的情况下,为sql预计绑定hint , 从而达到改变sql执行计划的目的。
这个对于生产库运维的小伙伴来说,在十万火急的情况下, 是一种快速解决问题的方式。
先解决问题,才能有更多的时间分析sql执行不正确的根本原因: 到底是统计信息不准确? 修改/禁用了某些参数? 优化器的bug?
如果你能联系到相关开发人员,改写合理的sql才是解决问题的根本途径。
have a fun 🙂 !