最近有个客户咨询,他们一个系统有个sql在凌晨1点左右执行计划突然变差了,数据库为oracle 11.2.0.4 rac, 从awr看数据库该时段实例级几乎空闲,上线很久的业务,问题时间点无人为操作,sql特征是查询一个分区表,2个列上查询条件,并不包含分区键列, 其中有一个列使用了绑定变量,执行计划有原来使用绑定变量列的索引改为全表分区扫描,直到白天10点以后人为收集了统计信息恢复正常。 简单记录如下
awr 问题时间点负载
snap id | snap time | sessions | cursors/session | instances | |
---|---|---|---|---|---|
begin snap: | 70734 | 06-may-23 01:00:09 | 110 | 5.6 | 2 |
end snap: | 70735 | 06-may-23 01:30:12 | 113 | 5.8 | 2 |
elapsed: | 30.06 (mins) | ||||
db time: | 14.19 (mins) |
sql执行计划历史
从sqlhc中可以获取以下信息
historical sql statistics – delta (dba_hist_sqlstat)
performance metrics of execution plans of 4ruhsafdgfj0x.
this section includes data captured by awr. if this is a stand-by read-only database then the awr information below is from the primary database.
# | snap id | snaphot | inst id | plan hv | vers cnt | execs | fetch | loads | inval | parse calls | buffer gets | disk reads | direct writes | rows proc | elapsed time (secs) | cpu time (secs) | io time (secs) | conc time (secs) | appl time (secs) | clus time (secs) | plsql time (secs) | java time (secs) | optimizer mode | cost | opt env hv | parsing schema name | module | action | profile |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
42 | 70446 | 2023-04-30/01:00:30 | 2 | 1429561684 | 1 | 4 | 3 | 0 | 0 | 4 | 466106 | 5041 | 0 | 3 | 30.131 | 2.554 | 20.120 | 0.000 | 0.000 | 8.151 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
43 | 70447 | 2023-04-30/01:30:03 | 2 | 1429561684 | 1 | 413 | 414 | 0 | 0 | 413 | 4651023 | 255743 | 0 | 414 | 807.680 | 67.347 | 482.346 | 0.000 | 0.000 | 283.310 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
44 | 70494 | 2023-05-01/01:00:12 | 2 | 1429561684 | 1 | 4 | 3 | 0 | 0 | 4 | 466519 | 2297 | 0 | 3 | 12.578 | 1.410 | 10.190 | 0.000 | 0.000 | 1.121 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
45 | 70495 | 2023-05-01/01:30:19 | 2 | 1429561684 | 1 | 413 | 414 | 0 | 0 | 413 | 4652526 | 84065 | 0 | 414 | 135.009 | 20.475 | 95.321 | 0.000 | 0.000 | 22.211 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
46 | 70542 | 2023-05-02/01:00:40 | 1 | 1429561684 | 1 | 4 | 3 | 0 | 0 | 4 | 473338 | 6152 | 0 | 3 | 35.516 | 5.871 | 25.837 | 0.000 | 0.000 | 5.443 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
47 | 70543 | 2023-05-02/01:30:44 | 1 | 1429561684 | 1 | 369 | 370 | 0 | 0 | 369 | 4644657 | 523841 | 0 | 370 | 876.644 | 156.018 | 658.911 | 0.000 | 0.000 | 117.955 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
48 | 70590 | 2023-05-03/01:00:04 | 2 | 1429561684 | 1 | 1 | 0 | 0 | 0 | 1 | 90420 | 729 | 0 | 0 | 3.197 | 0.246 | 2.904 | 0.000 | 0.000 | 0.068 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
49 | 70591 | 2023-05-03/01:30:17 | 2 | 1429561684 | 1 | 416 | 417 | 0 | 0 | 416 | 5004051 | 316980 | 0 | 417 | 828.918 | 63.491 | 553.742 | 0.000 | 0.000 | 234.327 | 0.000 | 0.000 | all_rows | 138524 | 642354766 | usr | jdbc thin client | ||
50 | 70638 | 2023-05-04/01:00:28 | 1 | 1429561684 | 1 | 4 | 3 | 0 | 0 | 4 | 455614 | 5548 | 0 | 3 | 27.518 | 5.166 | 19.182 | 0.000 | 0.000 | 4.471 | 0.000 | 0.000 | all_rows | 135832 | 642354766 | usr | jdbc thin client | ||
51 | 70639 | 2023-05-04/01:30:39 | 1 | 1429561684 | 1 | 413 | 414 | 0 | 0 | 413 | 4689575 | 293773 | 0 | 414 | 787.993 | 149.052 | 526.241 | 0.000 | 0.000 | 167.868 | 0.000 | 0.000 | all_rows | 135832 | 642354766 | usr | jdbc thin client | ||
52 | 70686 | 2023-05-05/01:00:05 | 2 | 1429561684 | 1 | 4 | 3 | 0 | 0 | 4 | 419394 | 1871 | 0 | 3 | 4.713 | 1.230 | 3.296 | 0.000 | 0.000 | 0.249 | 0.000 | 0.000 | all_rows | 135832 | 642354766 | usr | jdbc thin client | ||
53 | 70687 | 2023-05-05/01:30:12 | 2 | 1429561684 | 1 | 413 | 414 | 0 | 0 | 413 | 4736621 | 658419 | 0 | 414 | 900.719 | 65.645 | 732.501 | 0.000 | 0.000 | 125.709 | 0.000 | 0.000 | all_rows | 135832 | 642354766 | usr | jdbc thin client | ||
54 | 70734 | 2023-05-06/01:00:09 | 1 | 1429561684 | 1 | 1 | 0 | 0 | 0 | 1 | 108909 | 0 | 0 | 0 | 9.698 | 3.649 | 0.000 | 0.000 | 0.000 | 7.581 | 0.000 | 0.000 | all_rows | 135832 | 642354766 | usr | jdbc thin client | ||
55 | 70735 | 2023-05-06/01:30:12 | 1 | 1429561684 | 1 | 5 | 6 | 0 | 0 | 5 | 502520 | 41371 | 0 | 6 | 129.674 | 15.249 | 104.611 | 0.000 | 0.000 | 15.527 | 0.000 | 0.000 | all_rows | 135832 | 642354766 | usr | jdbc thin client | ||
56 | 70735 | 2023-05-06/01:30:12 | 2 | 1634677993 | 1 | 2 | 1 | 0 | 0 | 2 | 5011570 | 2053940 | 0 | 1 | 1669.041 | 110.873 | 1417.607 | 0.000 | 0.000 | 172.850 | 0.000 | 0.000 | all_rows | 740643 | 642354766 | usr | jdbc thin client | ||
57 | 70736 | 2023-05-06/02:00:19 | 2 | 1634677993 | 1 | 4 | 4 | 0 | 0 | 4 | 8814702 | 3512366 | 0 | 4 | 1804.871 | 187.616 | 1368.072 | 0.000 | 0.000 | 303.651 | 0.000 | 0.000 | all_rows | 740643 | 642354766 | usr | jdbc thin client | ||
58 | 70737 | 2023-05-06/02:30:32 | 2 | 1634677993 | 1 | 5 | 5 | 0 | 0 | 5 | 13510593 | 5394886 | 0 | 5 | 1812.430 | 291.556 | 1141.191 | 0.000 | 0.000 | 464.778 | 0.000 | 0.000 | all_rows | 740643 | 642354766 | usr | jdbc thin client | ||
59 | 70738 | 2023-05-06/03:00:08 | 2 | 1634677993 | 1 | 2 | 2 | 0 | 0 | 2 | 5496986 | 2193826 | 0 | 2 | 1776.219 | 121.269 | 1503.133 | 0.000 | 0.000 | 187.383 | 0.000 | 0.000 | all_rows | 740643 | 642354766 | usr | jdbc thin client |
note:
从以上可以确认是在2023-05-06/01:30前从实例2开始执行的sql执行计划发生改变,逻辑读和响应时间翻了近10倍。平时sql的执行次数并不多,so,你懂得
查看两个执行计划
inst: 2 child: 0 plan hash value: 1429561684 -------------------------------------------------------------------------------------------------------------------------- | id | operation | name | e-rows |e-bytes| cost (%cpu)| e-time | pstart| pstop | -------------------------------------------------------------------------------------------------------------------------- | 0 | select statement | | | | 1058 (100)| | | | | 1 | sort aggregate | | 1 | 27 | | | | | |* 2 | table access by global index rowid| tab_qwertyxxxxxx | 112 | 3024 | 1058 (0)| 00:00:13 | rowid | rowid | |* 3 | index range scan | inx_extxxx3 | 3634 | | 19 (0)| 00:00:01 | | | -------------------------------------------------------------------------------------------------------------------------- query block name / object alias (identified by operation id): ------------------------------------------------------------- 1 - sel$f5bb74e1 2 - sel$f5bb74e1 / o@sel$2 3 - sel$f5bb74e1 / o@sel$2 outline data ------------- /* begin_outline_data ignore_optim_embedded_hints optimizer_features_enable('11.2.0.4') db_version('11.2.0.4') opt_param('_b_tree_bitmap_plans' 'false') opt_param('_bloom_filter_enabled' 'false') opt_param('_optimizer_extended_cursor_sharing' 'none') opt_param('_optimizer_extended_cursor_sharing_rel' 'none') opt_param('_optimizer_adaptive_cursor_sharing' 'false') opt_param('_optimizer_use_feedback' 'false') all_rows outline_leaf(@"sel$f5bb74e1") merge(@"sel$2") outline(@"sel$1") outline(@"sel$2") index_rs_asc(@"sel$f5bb74e1" "o"@"sel$2" ("tab_qwertyxxxxxx"."extend3")) end_outline_data */ peeked binds (identified by position): -------------------------------------- 1 - (varchar2(30), csid=852): '221125' predicate information (identified by operation id): --------------------------------------------------- 2 - filter(("o"."queryxxxx">='2023/03/01 00:00:00' and "o"."queryxxxx"<='2023/03/31 00:00:00')) 3 - access("o"."exxxx3"=:1) plan hash value: 1634677993 ---------------------------------------------------------------------------------------------------------- | id | operation | name | rows | bytes | cost (%cpu)| time | pstart| pstop | ---------------------------------------------------------------------------------------------------------- | 0 | select statement | | | | 740k(100)| | | | | 1 | sort aggregate | | 1 | 27 | | | | | | 2 | partition range all| | 86226 | 2273k| 740k (1)| 02:28:08 | 1 | 6 | | 3 | table access full | tab_qwertyxxxxxx | 86226 | 2273k| 740k (1)| 02:28:08 | 1 | 6 | ---------------------------------------------------------------------------------------------------------- query block name / object alias (identified by operation id): ------------------------------------------------------------- 1 - sel$f5bb74e1 3 - sel$f5bb74e1 / o@sel$2 outline data ------------- /* begin_outline_data ignore_optim_embedded_hints optimizer_features_enable('11.2.0.4') db_version('11.2.0.4') opt_param('_b_tree_bitmap_plans' 'false') opt_param('_bloom_filter_enabled' 'false') opt_param('_optimizer_extended_cursor_sharing' 'none') opt_param('_optimizer_extended_cursor_sharing_rel' 'none') opt_param('_optimizer_adaptive_cursor_sharing' 'false') opt_param('_optimizer_use_feedback' 'false') all_rows outline_leaf(@"sel$f5bb74e1") merge(@"sel$2") outline(@"sel$1") outline(@"sel$2") full(@"sel$f5bb74e1" "o"@"sel$2") end_outline_data */ peeked binds (identified by position): -------------------------------------- 1 - :1 (varchar2(30), csid=852): '001358'
note:
从上面我们可以看到确实有两个执行计划,同时当前是禁用了sql acs,但bind peek功能启用,记录下两个变量值及对应的两个执行计划。
通常首先需要排除问题时间是否表有ddl,或统计信息改变,因为无人为操作且从last_ddl_time可以排除,下面看统计信息历史。
统计信息历史版本
tables statistics versions
# | table name | owner | version type | save time | last analyzed | num rows | sample size | perc | blocks | avg row len |
---|---|---|---|---|---|---|---|---|---|---|
1 | tab_qwertyxxxxxx | anbob | current | 06-may-2023 10:38:56 | 58870628 | 58870628 | 100.0 | 2734209 | 319 | |
2 | tab_qwertyxxxxxx | anbob | history | 06-may-23 10.38.57.053620 am 08:00 | 03-may-2023 19:33:02 | 58758724 | 58758724 | 100.0 | 2725002 | 319 |
3 | tab_qwertyxxxxxx | anbob | history | 03-may-23 07.33.02.744252 pm 08:00 | 29-apr-2023 19:33:36 | 58580497 | 58580497 | 100.0 | 2716818 | 319 |
4 | tab_qwertyxxxxxx | anbob | history | 29-apr-23 07.33.37.151765 pm 08:00 | 26-apr-2023 19:37:51 | 58493190 | 58493190 | 100.0 | 2716818 | 319 |
5 | tab_qwertyxxxxxx | anbob | history | 26-apr-23 07.37.51.237423 pm 08:00 | 22-apr-2023 19:33:59 | 58316831 | 58316831 | 100.0 | 2708634 | 319 |
6 | tab_qwertyxxxxxx | anbob | history | 22-apr-23 07.33.59.315324 pm 08:00 | 19-apr-2023 19:35:50 | 58152146 | 58152146 | 100.0 | 2700450 | 319 |
7 | tab_qwertyxxxxxx | anbob | history | 19-apr-23 07.35.50.262460 pm 08:00 | 15-apr-2023 19:31:17 | 57964514 | 57964514 | 100.0 | 2692266 | 319 |
8 | tab_qwertyxxxxxx | anbob | history | 15-apr-23 07.31.17.426153 pm 08:00 | 12-apr-2023 19:34:27 | 57816915 | 57816915 | 100.0 | 2684082 | 319 |
9 | tab_qwertyxxxxxx | anbob | history | 12-apr-23 07.34.27.186380 pm 08:00 | 08-apr-2023 19:30:39 | 57603137 | 57603137 | 100.0 | 2675898 | 319 |
10 | tab_qwertyxxxxxx | anbob | history | 08-apr-23 07.30.39.822771 pm 08:00 | 05-apr-2023 19:30:15 | 57514381 | 57514381 | 100.0 | 2667714 | 319 |
11 | tab_qwertyxxxxxx | anbob | history | 05-apr-23 07.30.15.853808 pm 08:00 | 01-apr-2023 19:30:33 | 56911270 | 56911270 | 100.0 | 2643162 | 319 |
indexes statistics versions: tab_qwertyuiopas (usr)
# | index name | owner | version type | save time | last analyzed | num rows | sample size | perc | distinct keys | blevel | leaf blocks | avg leaf blocks per key | avg data blocks per key | clustering factor |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
67 | inx_extxxx3 | anbob | current | 2023-05-06/10:39:23 | 60172285 | 268149 | 0.4 | 993 | 3 | 262098 | 263 | 17318 | 17197018 | |
68 | inx_extxxx3 | anbob | history | 06-may-23 10.39.23.293618 am 08:00 | 2023-05-03/19:33:30 | 59006085 | 262952 | 0.4 | 992 | 3 | 260302 | 262 | 17363 | 17224844 |
69 | inx_extxxx3 | anbob | history | 03-may-23 07.33.30.522839 pm 08:00 | 2023-04-29/19:34:04 | 55447571 | 247094 | 0.4 | 992 | 3 | 242126 | 244 | 16703 | 16570273 |
70 | inx_extxxx3 | anbob | history | 29-apr-23 07.34.04.044406 pm 08:00 | 2023-04-26/19:38:18 | 55158097 | 245804 | 0.4 | 991 | 3 | 238311 | 240 | 16403 | 16256115 |
71 | inx_extxxx3 | anbob | history | 26-apr-23 07.38.18.026010 pm 08:00 | 2023-04-22/19:34:25 | 57960612 | 258293 | 0.4 | 990 | 3 | 252449 | 254 | 17232 | 17060135 |
72 | inx_extxxx3 | anbob | history | 22-apr-23 07.34.25.739328 pm 08:00 | 2023-04-19/19:36:18 | 55882680 | 249033 | 0.4 | 988 | 3 | 240780 | 243 | 16678 | 16478718 |
73 | inx_extxxx3 | anbob | history | 19-apr-23 07.36.18.682023 pm 08:00 | 2023-04-15/19:31:44 | 59953721 | 267175 | 0.4 | 987 | 3 | 262098 | 265 | 18054 | 17819949 |
74 | inx_extxxx3 | anbob | history | 15-apr-23 07.31.44.415922 pm 08:00 | 2023-04-12/19:34:53 | 55789779 | 248619 | 0.4 | 986 | 3 | 242351 | 245 | 16658 | 16424862 |
75 | inx_extxxx3 | anbob | history | 12-apr-23 07.34.53.972449 pm 08:00 | 2023-04-08/19:31:05 | 54001097 | 240648 | 0.4 | 986 | 3 | 238087 | 241 | 16261 | 16033736 |
76 | inx_extxxx3 | anbob | history | 08-apr-23 07.31.05.861133 pm 08:00 | 2023-04-05/19:30:43 | 56885518 | 253502 | 0.4 | 986 | 3 | 243473 | 246 | 17136 | 16896324 |
77 | inx_extxxx3 | anbob | history | 05-apr-23 07.30.43.739025 pm 08:00 | 2023-04-01/19:30:58 | 56322726 | 250994 | 0.4 | 985 | 3 | 243697 | 247 | 16768 | 16516641
|
note:
注意表和索引该时间段都没有自动收集。 因为上面有绑定变量窥探到的变量值,那我们把全表扫时的变量带进去,确认真的是收集统计信息修正了吗?
10053 trace
当我们带入001358变量值后,还可以复现全表扫描,那到这里可能有人想到了问题点。下面直接附上10053 trace file部分内容。
sql> connect / as sysdba sql> oradebug setmypid sql> oradebug unlimit sql> oradebug event 10053 trace name context forever, level 1 sql> explain plan for ...enter your query here... sql> oradebug event 10053 trace name context off sql> oradebug tracefile_name -- trace file ******************************************* peeked values of the binds in sql statement ******************************************* final query after transformations:******* unparsed query is ******* select count(*) "total_count" from "usr"."tab_qwertyxxxxxx" "o" where "o"."queryxxx">='2023/03/01 00:00:00' and "o"."queryxxx"<='2023/03/31 00:00:00' and "o"."exxxx3"='001358' and '2023/03/31 00:00:00'>='2023/03/01 00:00:00' kkoqbc: optimizing query block sel$f5bb74e1 (#0) *************************************** base statistical information *********************** table stats:: table: tab_qwertyxxxxxxalias: o (using composite stats) #rows: 58992346 #blks: 2734209 avgrowlen: 319.00 chaincnt: 0.00 index stats:: ... index: inx_extxxx3 col#: 40 lvls: 3 #lb: 251775 #dk: 994 lb/k: 253.00 db/k: 17075.00 cluf: 16973517.00 index: inx_queryxixx col#: 18 lvls: 3 #lb: 483492 #dk: 31264768 lb/k: 1.00 db/k: 1.00 cluf: 47778440.00 ... index: pk_iqueryhistory col#: 1 lvls: 3 #lb: 407799 #dk: 56939871 lb/k: 1.00 db/k: 1.00 cluf: 51116553.00 access path analysis for tab_qwertyuiopas *************************************** single table access path single table cardinality estimation for tab_qwertyuiopas[o] column (#40): newdensity:0.000057, olddensity:0.002967 bktcnt:254, popbktcnt:240, popvalcnt:24, ndv:994 column (#40): exxxx3( avglen: 7 ndv: 994 nulls: 585162 density: 0.000057 histogram: htbal #bkts: 254 uncompbkts: 254 endptvals: 38 column (#18): newdensity:0.000000, olddensity:0.000000 bktcnt:254, popbktcnt:0, popvalcnt:0, ndv:31264768 column (#18): queryxxxx( avglen: 20 ndv: 31264768 nulls: 0 density: 0.000000 histogram: htbal #bkts: 254 uncompbkts: 254 endptvals: 255 table: tab_qwertxxxxas alias: o card: original: 58992346.000000 rounded: 79668 computed: 79667.56 non adjusted: 79667.56 access path: tablescan cost: 743146.94 resp: 743146.94 degree: 0 cost_io: 740517.00 cost_cpu: 77421299057 resp_io: 740517.00 resp_cpu: 77421299057 access path: index (alleqrange) index: inx_extxxx3 resc_io: 748296.00 resc_cpu: 8323150688 ix_sel: 0.043441 ix_sel_with_filters: 0.043441 cost: 748578.73 resp: 748578.73 degree: 1 access path: index (rangescan) index: inx_queryxixx resc_io: 1626415.00 resc_cpu: 13774888680 ix_sel: 0.033700 ix_sel_with_filters: 0.033700 cost: 1626882.92 resp: 1626882.92 degree: 1 ******** cost index join ******** index join: joining index inx_extxxx3 index join: joining index inx_queryxixx ix ha join outer table: tab_qwertyuiopas alias: o resc: 252234.49 card 58992346.00 bytes: 17 deg: 1 resp: 252234.49 inner table: alias: resc: 483986.54 card: 58992346.00 bytes: 30 deg: 1 resp: 483986.54 using dmeth: 2 #groups: 1 cost per ptn: 198690.65 #ptns: 1 hash_area: 256 (max=262144) buildfrag: 208836 probefrag: 302451 ppasses: 1 hash join: resc: 934911.67 resp: 934911.67 [multimatchcost=0.00] ******** index join cost ******** cost: 934911.67 ******** end index join costing ******** best:: accesspath: tablescan cost: 743146.94 degree: 1 resp: 743146.94 card: 79667.56 bytes: 0 ***************************************
note:
注意表有近6000万记录,谓词条件列exxxx3有994 distinct值,可见重复很高,列上有等高直方图,从uncompbkts: 254 endptvals: 38 判断压缩后相差这么大判断,在254个桶中有些桶endpoint_value重复值较多,存在严重的倾斜情况。cbo计算全表扫的cost小于使用该索引,因为10053缺失这类信息,从dba_histograms可以查看列的柱状图分布。
柱状图分布height balanced histogram
查询视图dba_histograms可以得到列的柱状图,在数据库不同的版本中视图列数可能不同。主要关注这几列
note:
可见传入的变量值’001358‘刚好是柱状图的endpoint value, 但对应前面有10个桶,属于popular value, 高频值,这就导致cbo产生了比较高的cost值。
为什么执行计划变了?
前面我们有提到这个环境库启用了peek bind并禁用acs自适应游标共享,可能是因为某些原因因为该sql执行频率并不高,sql被age out出shared pool, 在sql再次执行重新reload回shared pool里需要peek变量值,然后根据变量第一次的值生成执行计划,后面再执行相同sql时,发现shared pool中存在相同文本直接使用第一次的执行计划,因为当前的acs已禁用,当然如果acs开着又可能会引入一些parse相关的问题如latch, version high.., 所以使用变量的sql第一次传入的值就很关键,有一定小运气在里面,如果第一次传入的是低频值,那就用索引。 这个现象在大师jonathan lewis书中记录柱状图章节同样有提到。
jonathan lewis ‘s 《cost-based oracle fundamentals 》
bind variable peeking
of course, things got messier when 9i introduced bind variable peeking. whenever a statement is
optimized, oracle will (in almost all cases) check the actual values of any bind variables and
optimize the statement for those specific values.parsing and optimizing
when an sql statement is first executed, it has to be checked for syntax, interpreted, and optimized. thereafter
if the same piece of text is fired at the database again, it may be recognized as previously used, in which
case the existing execution plan may be tracked down and reused.
however, even when a statement is still in memory, some of the information about the execution plan
may become invalid, or may get pushed out of memory by the standard memory management lru routines.
when this happens, the statement will have to be reoptimized. (you can detect this from the loads and
invalidations columns in v$sql, summarized in the reloads and invalidations columns of
v$librarycache. reloads occur when information is lost from memory; invalidations occur when some of
the dependent information changes.)
the reuse (or sharing) of sql is generally a good thing—but if the execution plan generated on the first
use of a statement caters to an unlucky set of values, then every subsequent execution of that statement will
follow the same unlucky execution plan until you can force the statement out of memory—perhaps by the
extreme method of flushing the shared pool.
bind variable peeking has the unfortunate side effect of making it very easy for one user to introduce an
execution plan that is bad news for every other user for an arbitrary amount of time, until that execution plan
happens to get flushed from the shared pool for some reason.
我们看一下sql是否有age out出shared pool。
sql first load
performance metrics of child cursors of 4ruhsafdgfj0x while still in memory.
# | inst id | child num | plan hv | execs | fetch | loads | inval | parse calls | buffer gets | optimizer mode | cost | opt env hv | parsing schema name | module | first load | last load | last active |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 1 | 0 | 1429561684 | 26 | 26 | 2 | 1 | 26 | 9138 | all_rows | 1058 | 642354766 | usr | jdbc thin client | 2023-05-06/10:38:42 | 2023-05-06/10:40:12 | 2023-05-06/10:40:12 |
1 | 2 | 0 | 1429561684 | 124 | 124 | 12 | 3 | 124 | 105242 | all_rows | 1058 | 642354766 | usr | jdbc thin client | 2023-05-06/01:02:24 | 2023-05-06/10:39:55 | 2023-05-06/10:40:11 |
plan hv | avg elapsed time (secs) | avg cpu time (secs) | avg io time (secs) | total execs | min cost | max cost | first snapshot | last snapshot |
---|---|---|---|---|---|---|---|---|
1 | 1429561684 | 2.129 | 0.302 | 1 | 10741 | 1123 | 2023-04-10/01:00:16 | 2023-05-06/01:30:12 |
2 | 1634677993 | 774.059 | 67.064 | 1 | 44 | 740643 | 2023-05-06/01:30:12 | 2023-05-06/10:30:15 |
awr
sga breakdown differencez
- ordered by pool, name
- n/a value for begin mb or end mb indicates the size of that pool/name was insignificant, or zero in that snapshot
pool | name | begin mb | end mb | % diff |
---|---|---|---|---|
java | free memory | 896.00 | 896.00 | 0.00 |
large | ctwr dba buffer | 4.79 | 4.79 | 0.00 |
large | px msg pool | 15.63 | 15.63 | 0.00 |
large | free memory | 99.29 | 99.29 | 0.00 |
large | krcc extent chunk | 7.92 | 7.92 | 0.00 |
shared | fileopenblock | 102.40 | 102.40 | 0.00 |
shared | kglh0 | 1,316.13 | 1,308.75 | -0.56 |
shared | kglhd | 200.19 | 198.73 | -0.73 |
shared | sqla | 1,494.70 | 1,505.73 | 0.74 |
shared | db_block_hash_buckets | 356.00 | 356.00 | 0.00 |
shared | free memory | 1,660.42 | 1,662.17 | 0.11 |
shared | gcs resources | 1,090.61 | 1,090.61 | 0.00 |
shared | gcs shadows | 755.04 | 755.04 | 0.00 |
shared | kglsim object batch | 127.54 | 127.54 | 0.00 |
streams | free memory | 895.94 | 895.94 | 0.00 |
buffer_cache | 37,888.00 | 37,888.00 | 0.00 | |
fixed_sga | 2.16 | 2.16 | 0.00 | |
log_buffer | 39.54 | 39.54 | 0.00 |
library cache activity
- “pct misses” should be very low
namespace | get requests | pct miss | pin requests | pct miss | reloads | invali- dations |
---|---|---|---|---|---|---|
account_status | 1,392 | 0.00 | 0 | 0 | 0 | |
body | 1,290 | 0.00 | 1,744 | 0.00 | 0 | 0 |
cluster | 20 | 5.00 | 20 | 5.00 | 0 | 0 |
dblink | 1,437 | 0.00 | 0 | 0 | 0 | |
edition | 488 | 0.00 | 946 | 0.00 | 0 | 0 |
index | 1,113 | 0.00 | 1,113 | 0.00 | 0 | 0 |
queue | 9 | 0.00 | 399 | 0.00 | 0 | 0 |
schema | 539 | 0.00 | 0 | 0 | 0 | |
sql area | 23,179 | 50.64 | 166,761 | 21.17 | 53 | 0 |
sql area build | 11,809 | 99.70 | 0 | 0 | 0 | |
sql area stats | 11,804 | 99.69 | 11,804 | 99.69 | 0 | 0 |
table/procedure | 26,143 | 0.20 | 83,245 | 0.27 | 56 | 0 |
note:
基本上可以判断当时有新的sql执行,把执行频率少的sql age out出内存,而刚好第一次传入的变量是一个高频值,选择了全表扫,后续所有的sql继续沿用这个执行计划,直到人工执行了统计信息收集,使shared pool的该sql 失效,再次解析时传入的值为低频的变量值,执行计划改为索引扫描。