上一篇文章 说的是获取整个数据库至运行以来的top sql, 但是这个对于长时间运行的系统来说,显然是不够的,总有一些需求,是获取一段时间内的top sql. 下面列举出集中获取一段时间内的top sql的方法。
最常见的一个需求,是获取最近时间段,比如,我刚刚1分钟之内运行的top sql.
mogdb的"active session profile"功能,会以秒为单位,将活动会话收集起来,以方便性能诊断。这个数据保存在表dbe_perf.local_active_session中(要求初始化参数enable_asp=on)。
通过这个表,我们可以轻松获取最近一段时间数据库内部的top sql信息。
select count(*),unique_query_id,unique_query
from dbe_perf.local_active_session
where sample_time > sysdate - 5/24/60
group by unique_query_id,unique_query
order by 1 desc
limit 30;
上面的sql用sample_time > sysdate - 5/24/60作为限制条件,意思是当前时间5分钟内的,运行时间最长的30条sql. 当然,你也可以根据需要定制sql获取更适合你的信息,比如调整不同时间段,增加group by字段,增加其他过滤条件等。
关于local_active_session的其他列,可以参考
这里就不详细展开了。
不过值得一提的是,相对于opengauss, mogdb为了更准确的观测历史会话信息,在local_active_session里增加了plan_node_id信息,你可以观察到相关sql被采样捕获时,处于执行计划的哪一步,对性能优化来说更为方便,具体可以参考
,
顺便说一下,dbe_perf.local_active_session其实是个view, 其原始数据来自于函数get_local_active_session(), 目的是增加final_block_sessionid, 有时候访问起来会有点慢。所以,在无特殊需求的情况下,其实访问它的原始数据会更快一些。
select count(*)"sql数量",unique_query_id,unique_query
from get_local_active_session()
where sample_time > sysdate - 5/24/60
group by unique_query_id,unique_query
order by 1 desc
limit 30;
local_active_session底层其实是使用了一块内存区域,所以,需要控制总行数,默认控制为10万行。对于相对久远一点的数据,mogdb通过更稀疏的采样比(默认10秒一次),将其保存在gs_asp里面。
可以用以下sql获取local_active_session的最早时间点
select min(sample_time) from get_local_active_session();
关于这一点,可以参考kamus的这一篇文章。
因此,当我们发现,local_active_session里面最早的数据已经无法满足需求时,就需要访问gs_asp. 值得注意的是,local_active_session可以在任意database中访问,而gs_asp需要在postgres库中才能获取到数据。
select count(*)*10 "sql执行时间",unique_query_id,unique_query
from gs_asp
where sample_time > sysdate - 5/24/60
group by unique_query_id,unique_query
order by 1 desc
limit 30;
sql和访问local_active_session基本类似,里面的count(*)加了乘以10的表达式,是为了更好体现采样比的问题。
相比起上一篇文章中的statement, 活动会话历史其实是有一些小缺陷,比如,你无法知道它的准确执行次数(毕竟活动会话历史基于采样,持久化的甚至是10秒才采1次),无法知道它的逻辑读、物理读等信息,因此,有时候还是希望从statement中获取相关信息.
mogdb通过wdr功能(需要enable_wdr_snapshot=on),定期以快照形式将dbe_perf里的相关内容持久化到snapshot schema下,因此,我们可以通过snapshot里的快照信息来获取特定时间段的top sql.
与gs_asp类似,需要在postgres库中才能获取到数据。
第一步,获取需要时间对应的snapshot_id
select * from snapshot.snapshot
where start_ts between .. and ..
order by snap_id;
然后根据替换下面的sql的snap_id部分,就可以获取需要的top sql了。
with wdr_statement as
(
select snapshot_id
,snap_node_name node_name
,snap_node_id node_id
,snap_user_name user_name
,snap_user_id user_id
,snap_unique_sql_id unique_sql_id
,snap_query query
,snap_n_calls n_calls
,snap_min_elapse_time min_elapse_time
,snap_max_elapse_time max_elapse_time
,snap_total_elapse_time total_elapse_time
,snap_n_returned_rows n_returned_rows
,snap_n_tuples_fetched n_tuples_fetched
,snap_n_tuples_returned n_tuples_returned
,snap_n_tuples_inserted n_tuples_inserted
,snap_n_tuples_updated n_tuples_updated
,snap_n_tuples_deleted n_tuples_deleted
,snap_n_blocks_fetched n_blocks_fetched
,snap_n_blocks_hit n_blocks_hit
,snap_n_soft_parse n_soft_parse
,snap_n_hard_parse n_hard_parse
,snap_db_time db_time
,snap_cpu_time cpu_time
,snap_execution_time execution_time
,snap_parse_time parse_time
,snap_plan_time plan_time
,snap_rewrite_time rewrite_time
,snap_pl_execution_time pl_execution_time
,snap_pl_compilation_time pl_compilation_time
,snap_data_io_time data_io_time
,snap_last_updated last_updated
,snap_sort_count sort_count
,snap_sort_time sort_time
,snap_sort_mem_used sort_mem_used
,snap_sort_spill_count sort_spill_count
,snap_sort_spill_size sort_spill_size
,snap_hash_count hash_count
,snap_hash_time hash_time
,snap_hash_mem_used hash_mem_used
,snap_hash_spill_count hash_spill_count
,snap_hash_spill_size hash_spill_size
from
snapshot.snap_summary_statement
)
,statement1 as (select * from wdr_statement where snapshot_id =1)
,statement2 as (select * from wdr_statement where snapshot_id =2 )
select unique_sql_id,round(total_elapse_time/1e6,1) "总执行时间",n_calls "执行次数",round(total_elapse_time/n_calls/1e3,2) "单次时间"
,round(cpu_time/n_calls/1e3,2) "单次cpu时间" ,round(data_io_time/n_calls/1e3,1) "单次io时间"
,round(n_blocks_fetched/n_calls,1) "单次内存块",round((n_blocks_fetched-n_blocks_hit)/n_calls,1) "单次物理块"
,round(n_tuples_fetched/n_calls,1) "单次访问行数",round(n_tuples_returned/n_calls,1) "单次返回行数"
,substr(query,1,1024) "sql文本"
from (
select a.unique_sql_id,a. query,
a.total_elapse_time - nvl(b.total_elapse_time,0) as total_elapse_time,
a.n_calls - nvl(b.n_calls,0) as n_calls,
a.cpu_time - nvl(b.cpu_time,0) as cpu_time,
a.data_io_time - nvl(b.data_io_time,0) as data_io_time,
a.sort_time - nvl(b.sort_time,0) as sort_time,
a.n_blocks_fetched - nvl(b.n_blocks_fetched,0) as n_blocks_fetched,
a.n_blocks_hit - nvl(b.n_blocks_hit,0) as n_blocks_hit,
a.n_tuples_fetched - nvl(b.n_tuples_fetched,0) as n_tuples_fetched,
a.n_tuples_returned - nvl(b.n_tuples_returned,0) as n_tuples_returned
from statement2 a , statement1 b
where a.unique_sql_id = b.unique_sql_id( )
)
where n_calls > 10
order by 1 desc limit 30;
sql看起来异常复杂,其实可以写得更简单,其中大版篇幅在改动snapshot.snap_summary_statement的列名上面。
在一个自己控制的数据库内,完全可以创建一个view, 以简化整段sql.
create view wdr_statement as
select snapshot_id
,snap_node_name node_name
,snap_node_id node_id
,snap_user_name user_name
,snap_user_id user_id
,snap_unique_sql_id unique_sql_id
,snap_query query
,snap_n_calls n_calls
,snap_min_elapse_time min_elapse_time
,snap_max_elapse_time max_elapse_time
,snap_total_elapse_time total_elapse_time
,snap_n_returned_rows n_returned_rows
,snap_n_tuples_fetched n_tuples_fetched
,snap_n_tuples_returned n_tuples_returned
,snap_n_tuples_inserted n_tuples_inserted
,snap_n_tuples_updated n_tuples_updated
,snap_n_tuples_deleted n_tuples_deleted
,snap_n_blocks_fetched n_blocks_fetched
,snap_n_blocks_hit n_blocks_hit
,snap_n_soft_parse n_soft_parse
,snap_n_hard_parse n_hard_parse
,snap_db_time db_time
,snap_cpu_time cpu_time
,snap_execution_time execution_time
,snap_parse_time parse_time
,snap_plan_time plan_time
,snap_rewrite_time rewrite_time
,snap_pl_execution_time pl_execution_time
,snap_pl_compilation_time pl_compilation_time
,snap_data_io_time data_io_time
,snap_last_updated last_updated
,snap_sort_count sort_count
,snap_sort_time sort_time
,snap_sort_mem_used sort_mem_used
,snap_sort_spill_count sort_spill_count
,snap_sort_spill_size sort_spill_size
,snap_hash_count hash_count
,snap_hash_time hash_time
,snap_hash_mem_used hash_mem_used
,snap_hash_spill_count hash_spill_count
,snap_hash_spill_size hash_spill_size
from
snapshot.snap_summary_statement;
然后上面的sql就简化成了:
with statement1 as (select * from wdr_statement where snapshot_id =1)
,statement2 as (select * from wdr_statement where snapshot_id =2 )
select unique_sql_id,round(total_elapse_time/1e6,1) "总执行时间",n_calls "执行次数",round(total_elapse_time/n_calls/1e3,2) "单次时间"
,round(cpu_time/n_calls/1e3,2) "单次cpu时间" ,round(data_io_time/n_calls/1e3,1) "单次io时间"
,round(n_blocks_fetched/n_calls,1) "单次内存块",round((n_blocks_fetched-n_blocks_hit)/n_calls,1) "单次物理块"
,round(n_tuples_fetched/n_calls,1) "单次访问行数",round(n_tuples_returned/n_calls,1) "单次返回行数"
,substr(query,1,1024) "sql文本"
from (
select a.unique_sql_id,a. query,
a.total_elapse_time - nvl(b.total_elapse_time,0) as total_elapse_time,
a.n_calls - nvl(b.n_calls,0) as n_calls,
a.cpu_time - nvl(b.cpu_time,0) as cpu_time,
a.data_io_time - nvl(b.data_io_time,0) as data_io_time,
a.sort_time - nvl(b.sort_time,0) as sort_time,
a.n_blocks_fetched - nvl(b.n_blocks_fetched,0) as n_blocks_fetched,
a.n_blocks_hit - nvl(b.n_blocks_hit,0) as n_blocks_hit,
a.n_tuples_fetched - nvl(b.n_tuples_fetched,0) as n_tuples_fetched,
a.n_tuples_returned - nvl(b.n_tuples_returned,0) as n_tuples_returned
from statement2 a , statement1 b
where a.unique_sql_id = b.unique_sql_id( )
)
where n_calls > 10
order by 1 desc limit 30;
当然,你也可以通过内置的select generate_wdr_report()函数来生成完整的wdr报告,但可定化程度就会差了许多。
前面之所以特意修改列名,其实是为了更方便sql的重用。
比如下面一个场景:
获取某个snapshot到当前时间区间内的top sql
有时候,我们先看看最近一两个小时的top sql, 但是,wdr还没做,而我们不想去主动调用一次wdr的生成快照功能。那怎么办呢?可以拿snapshot.snap_summary_statement的数据和内存里的dbe_perf.statement数据做对比
和上一条sql相比,仅需改动一个点即可
,statement1 as (select * from wdr_statement where snapshot_id =1)
=>
,statement1 as (select * from dbe_perf.statement)
也就是
with
,statement1 as (select * from dbe_perf.statement)
,statement2 as (select * from wdr_statement where snapshot_id =2 )
select unique_sql_id,round(total_elapse_time/1e6,1) "总执行时间",n_calls "执行次数",round(total_elapse_time/n_calls/1e3,2) "单次时间"
,round(cpu_time/n_calls/1e3,2) "单次cpu时间" ,round(data_io_time/n_calls/1e3,1) "单次io时间"
,round(n_blocks_fetched/n_calls,1) "单次内存块",round((n_blocks_fetched-n_blocks_hit)/n_calls,1) "单次物理块"
,round(n_tuples_fetched/n_calls,1) "单次访问行数",round(n_tuples_returned/n_calls,1) "单次返回行数"
,substr(query,1,1024) "sql文本"
from (
select a.unique_sql_id,a. query,
a.total_elapse_time - nvl(b.total_elapse_time,0) as total_elapse_time,
a.n_calls - nvl(b.n_calls,0) as n_calls,
a.cpu_time - nvl(b.cpu_time,0) as cpu_time,
a.data_io_time - nvl(b.data_io_time,0) as data_io_time,
a.sort_time - nvl(b.sort_time,0) as sort_time,
a.n_blocks_fetched - nvl(b.n_blocks_fetched,0) as n_blocks_fetched,
a.n_blocks_hit - nvl(b.n_blocks_hit,0) as n_blocks_hit,
a.n_tuples_fetched - nvl(b.n_tuples_fetched,0) as n_tuples_fetched,
a.n_tuples_returned - nvl(b.n_tuples_returned,0) as n_tuples_returned
from statement2 a , statement1 b
where a.unique_sql_id = b.unique_sql_id( )
)
where n_calls > 10
order by 1 desc limit 30;
手动生成statement的快照,获取到当前时间区间内的top sql
wdr数据,毕竟是系统定期生成,时间点上相对不好把握,我们也可以变通一下,手动去生成 statement的快照, 并替换刚才sql里的表名。
create table statement_01121137
as select * from statement_01121137
;
然后
with
,statement1 as (select * from dbe_perf.statement)
,statement2 as (select * from statement_01121137)
select unique_sql_id,round(total_elapse_time/1e6,1) "总执行时间",n_calls "执行次数",round(total_elapse_time/n_calls/1e3,2) "单次时间"
,round(cpu_time/n_calls/1e3,2) "单次cpu时间" ,round(data_io_time/n_calls/1e3,1) "单次io时间"
,round(n_blocks_fetched/n_calls,1) "单次内存块",round((n_blocks_fetched-n_blocks_hit)/n_calls,1) "单次物理块"
,round(n_tuples_fetched/n_calls,1) "单次访问行数",round(n_tuples_returned/n_calls,1) "单次返回行数"
,substr(query,1,1024) "sql文本"
from (
select a.unique_sql_id,a. query,
a.total_elapse_time - nvl(b.total_elapse_time,0) as total_elapse_time,
a.n_calls - nvl(b.n_calls,0) as n_calls,
a.cpu_time - nvl(b.cpu_time,0) as cpu_time,
a.data_io_time - nvl(b.data_io_time,0) as data_io_time,
a.sort_time - nvl(b.sort_time,0) as sort_time,
a.n_blocks_fetched - nvl(b.n_blocks_fetched,0) as n_blocks_fetched,
a.n_blocks_hit - nvl(b.n_blocks_hit,0) as n_blocks_hit,
a.n_tuples_fetched - nvl(b.n_tuples_fetched,0) as n_tuples_fetched,
a.n_tuples_returned - nvl(b.n_tuples_returned,0) as n_tuples_returned
from statement2 a , statement1 b
where a.unique_sql_id = b.unique_sql_id( )
)
where n_calls > 10
order by 1 desc limit 30;