故障现象
一早来,业务问我,数据库有问题吗,发了一个类似下面的截图说用户连不上了。
[oracle@rac1 ~]$ sqlplus zylong/zylong@192.168.0.10:1521/orcl
sql*plus: release 19.0.0.0.0 - production on mon nov 1 09:53:46 2021
version 19.11.0.0.0
米乐app官网下载 copyright (c) 1982, 2020, oracle. all rights reserved.
<<<<<<<<<<<<<<<<<<<<<<<<< # 挂起,超时显示ora-04021报错
error:
ora-04021: timeout occurred while waiting to lock object 107
当时还没有报出ora-04021报错,一直挂着,于是我就对数据库检查了一番,查本地磁盘空间正常,归档空间正常,用户状态正常,监听正常,数据库alert日志也正常,使用其他用户登录正常,就这个用户不能登录。
查询此用户的等待事件,大量的library cache lock
sql> set numw 20
sql> select con_id,username,event,p1,p2,p3,last_call_et from v$session where username=upper('zylong') and status!='inactive';
con_id username event p1 p2 p3 last_call_et
------ ----------------------- ------------------------------------ -------------------- -------------------- -------------------- --------------------
7 zylong library cache lock 45015708272 44295738072 8323074 753
7 zylong library cache lock 45015708272 34047265144 8323074 753
7 zylong library cache lock 45015708272 73865793920 8323074 753
7 zylong library cache lock 45015708272 29450117360 8323074 746
7 zylong library cache lock 45015708272 34649553984 8323074 753
7 zylong library cache lock 45015708272 47329041440 8323074 753
7 zylong library cache lock 45015708272 44281610752 8323074 753
7 zylong library cache lock 45015708272 44680843968 8323074 753
7 zylong library cache lock 45015708272 29088738304 8323074 753
7 zylong library cache lock 45015708272 45463714896 8323074 753
7 zylong library cache lock 45015708272 11763471408 8323074 753
7 zylong library cache lock 45015708272 7089814448 8323074 753
7 zylong library cache lock 45015708272 7190841040 8323074 753
7 zylong library cache lock 45015708272 45727694264 8323074 753
7 zylong library cache lock 45015708272 29514801880 8323074 753
7 zylong library cache lock 45015708272 45400941088 8323074 753
7 zylong library cache lock 45015708272 46857941256 8323074 753
7 zylong library cache lock 45015708272 28912446224 8323074 753
7 zylong sql*net message to client 1413697536 1 0 0
7 zylong library cache lock 45015708272 73782262016 8323074 753
7 zylong gc cr request 313 2241577 1 0
7 zylong library cache lock 45015708272 55249461128 8323074 753
7 zylong library cache lock 45015708272 45011572904 8323074 222
7 zylong library cache lock 45015708272 11359884584 8323074 753
7 zylong library cache lock 45015708272 33929685512 8323074 753
7 zylong library cache lock 45015708272 45322411920 8323074 753
7 zylong library cache lock 45015708272 6259354512 8323074 322
7 zylong library cache lock 45015708272 7434264256 8323074 753
7 zylong library cache lock 45015708272 34761525680 8323074 753
7 zylong enq: tx - row lock contention 1415053318 2097163 1749981 3683
7 zylong library cache lock 45015708272 7483659600 8323074 753
7 zylong library cache lock 45015708272 53117430168 8323074 753
7 zylong library cache lock 45015708272 29114680592 8323074 754
7 zylong library cache lock 45015708272 35259998832 8323074 753
7 zylong library cache lock 45015708272 44815780096 8323074 753
7 zylong library cache lock 45015708272 12702913336 8323074 753
7 zylong library cache lock 45015708272 7311180264 8323074 753
7 zylong library cache lock 45015708272 74195848184 8323074 753
7 zylong library cache lock 45015708272 29050270464 8323074 753
7 zylong library cache lock 45015708272 74570849968 8323074 753
7 zylong library cache lock 45015708272 29438472736 8323074 399
7 zylong library cache lock 45015708272 46973671688 8323074 753
7 zylong library cache lock 45015708272 73576444472 8323074 753
7 zylong library cache lock 45015708272 28788937488 8323074 753
7 zylong library cache lock 45015708272 7054446560 8323074 265
7 zylong library cache lock 45015708272 74444654688 8323074 867
7 zylong library cache lock 45015708272 73610426000 8323074 35
7 zylong library cache lock 45015708272 45806155072 8323074 286
7 zylong library cache lock 45015708272 34497853112 8323074 13
7 zylong library cache lock 45015708272 73554239120 8323074 585
7 zylong library cache lock 45015708272 29505514120 8323074 586
7 zylong library cache lock 45015708272 73977126296 8323074 803
7 zylong library cache lock 45015708272 8521679328 8323074 285
7 zylong library cache lock 45015708272 11542066912 8323074 773
7 zylong library cache lock 45015708272 34491470232 8323074 18
7 zylong library cache lock 45015708272 12456319768 8323074 815
7 zylong library cache lock 45015708272 28327281600 8323074 783
7 zylong library cache lock 45015708272 6263935184 8323074 635
7 zylong library cache lock 45015708272 28055520024 8323074 815
7 zylong library cache lock 45015708272 29469195360 8323074 803
7 zylong library cache lock 45015708272 29269468184 8323074 883
7 zylong library cache lock 45015708272 28039214464 8323074 803
7 zylong library cache lock 45015708272 8131252688 8323074 754
7 zylong library cache lock 45015708272 8457524272 8323074 754
7 zylong library cache lock 45015708272 74579988512 8323074 753
7 zylong library cache lock 45015708272 7183580904 8323074 754
7 zylong library cache lock 45015708272 8292110304 8323074 753
7 zylong library cache lock 45015708272 47530526208 8323074 656
68 rows selected.
故障分析
sql> oradebug setmypid statement processed. sql> oradebug tracefile_name /u02/app/oracle/diag/rdbms/cdbname/cdbname1/trace/cdbname1_ora_40833.trc sql> oradebug hanganalyze 3 hang analysis in /u02/app/oracle/diag/rdbms/cdbname/cdbname1/trace/cdbname1_ora_40833.trc sql> oradebug -g all hanganalyze 3 hang analysis in /u02/app/oracle/diag/rdbms/cdbname/cdbname1/trace/cdbname1_diag_27412.trc
chains most likely to have caused the hang:
[a] chain 1 signature: 'sql*net message from client'<='enq: tx - row lock contention'<='library cache lock'
chain 1 signature hash: 0xc1d502c7
[b] chain 2 signature: 'cpu or wait cpu'<='px deq: execute reply'
chain 2 signature hash: 0x822b6140
[c] chain 3 signature: 'sql*net message from client'<='enq: tx - row lock contention'<='library cache lock'
chain 3 signature hash: 0xc1d502c7
-------------------------------------------------------------------------------
chain 1:
-------------------------------------------------------------------------------
oracle session identified by:
{
instance: 1 (cdbname.cdbname1)
os id: 107393
process id: 962, oracle@rac1
session id: 103
session serial #: 1876
module name: 7 (oracle@rac1tr@zcloud (tns v1-v3))))
pdb id: 7 (orcl)
}
is waiting for 'library cache lock' with wait info:
{
p1: 'handle address'=0xa7b253270
p2: 'lock address'=0x1138f73a70
p3: '100*mode namespace'=0x7f0002
name: orcl.107 [nsp: 127 typ: 255 id: 0]
px1: 'pdb uid'=0x53556d15
px2: 'master hint'=0x0
time in wait: 1 min 3 sec
timeout after: 13 min 56 sec
wait id: 15
blocking: 0 sessions
current sql_id: 0
current sql:
short stack: ksedsts() 426<-ksdxfstk() 58<-ksdxcb() 872<-sspuser() 223<-__sighandler()<-semtimedop() 10<-skgpwwait() 187<-ksliwat() 2230<-kslwaitctx() 200<-kjusuc() 4750<-ksipgetctxia() 1980<-ksixpgetia() 124<-kqlmlock() 805<-kqlmclusterlock() 1093<-kgllkal() 1058<-kgllock() 1415<-kglget() 293<-kziaretrievelslt() 298<-kpolnb() 2876<-kpoauth() 1805<-opiodr() 1202<-ttcpip() 1239<-opitsk() 1897<-opiino() 936<-opiodr() 1202<-opidrv() 1094<-sou2o() 165<-opimai_real() 422<-ssthrdmain() 417<-main() 256<-__libc_start_main() 245
wait history:
* time between current wait and wait #1: 0.000518 sec
1. event: ' pga memory operation'
time waited: 0.000002 sec
wait id: 14 p1: ''=0x10000
p2: ''=0x1
* time between wait #1 and #2: 0.000489 sec
2. event: 'pga memory operation'
time waited: 0.000003 sec
wait id: 13 p1: ''=0x30000
p2: ''=0x1
* time between wait #2 and #3: 0.000218 sec
3. event: 'pga memory operation'
time waited: 0.000001 sec
wait id: 12 p1: ''=0x10000
p2: ''=0x1
}
and is blocked by
=> oracle session identified by:
{
instance: 1 (cdbname.cdbname1)
os id: 22834
process id: 202, oracle@rac1
session id: 2019
session serial #: 34291
module name: 7 (oracle@rac1tzcloud (tns v1-v3)3))))
pdb id: 7 (orcl)
}
which is waiting for 'enq: tx - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x20000b
p3: 'sequence'=0x1ab3dd
px1: 'pdb uid'=0x53556d15
px2: 'master hint'=0x0
time in wait: 64 min 53 sec
timeout after: never
wait id: 32
blocking: 137 sessions
current sql_id: 2679977578
current sql: update user$ set spare6=decode(to_char(:2, 'yyyy-mm-dd'), '0000-00-00', to_date(null), :2) where user#=:1
short stack: ksedsts() 426<-ksdxfstk() 58<-ksdxcb() 872<-sspuser() 223<-__sighandler()<-semtimedop() 10<-skgpwwait() 187<-ksliwat() 2230<-kslwaitctx() 200<-kjusuc() 4750<-ksipgetctxia() 1980<-ksqcmi() 2757<-ksqgtlctx() 6620<-ksqgelctx() 838<-ktugettxforxid() 324<-ktcwit1() 502<-kdddgb() 5858<-kdusru() 458<-kauupd() 356<-updrow() 1734<-qerupupdrow() 671<-qerupfetch() 594<-updaul() 1416<-updthreephaseexe() 340<-updexe() 443<-opiexe() 11741<-kpoal8() 2226<-opiodr() 1202<-kpoodrc() 40<-rpiswu2() 2077<-kpoodr() 660<-upirtrc() 2783<
wait history:
* time between current wait and wait #1: 0.000046 sec
1. event: 'ipc group service call'
time waited: 0.000003 sec
wait id: 31 p1: 'function id'=0xb
* time between wait #1 and #2: 0.000218 sec
2. event: 'gc cr block 2-way'
time waited: 0.000330 sec
wait id: 30 p1: ''=0x124
p2: ''=0x64
p3: ''=0x4f
* time between wait #2 and #3: 0.000037 sec
3. event: 'ipc group service call'
time waited: 0.000001 sec
wait id: 29 p1: 'function id'=0xb
}
and is blocked by
=> oracle session identified by:
{
instance: 2 (cdbname.cdbname2)
os id: 13073
process id: 927, oracle@rac2
session id: 6105
session serial #: 18816
module name: 7 (pl/sql developerzcloud (tns v1-v3)3))
pdb id: 7 (orcl)
}
which is waiting for 'sql*net message from client' with wait info:
{
p1: 'driver id'=0x54435000
p2: '#bytes'=0x1
time in wait: 64 min 52 sec
timeout after: never
wait id: 1782
blocking: 138 sessions
current sql_id: 0
current sql:
short stack: ksedsts() 426<-ksdxfstk() 58<-ksdxcb() 872<-sspuser() 223<-__sighandler()<-read() 14<-snttread() 16<-nttfprd() 354<-nsbasic_brc() 399<-nioqrc() 438<-opikndf2() 999<-opitsk() 907<-opiino() 936<-opiodr() 1202<-opidrv() 1094<-sou2o() 165<-opimai_real() 422<-ssthrdmain() 417<-main() 256<-__libc_start_main() 245
wait history:
* time between current wait and wait #1: 0.000172 sec
1. event: ' enq: ps - contention'
time waited: 0.000118 sec
wait id: 1781 p1: 'name|mode'=0x50530006
p2: 'instance'=0x2
p3: 'slave id'=0x1
* time between wait #1 and #2: 0.000031 sec
2. event: 'px deq: slave session stats'
time waited: 0.000001 sec
wait id: 1780 p1: 'sleeptime/senderid'=0x0
p2: 'passes'=0x0
* time between wait #2 and #3: 0.000006 sec
3. event: 'px deq: slave session stats'
time waited: 0.000022 sec
wait id: 1779 p1: 'sleeptime/senderid'=0x1
p2: 'passes'=0x1
}
chain 1 signature: 'sql*net message from client'<='enq: tx - row lock contention'<='library cache lock'
chain 1 signature hash: 0xc1d502c7
-------------------------------------------------------------------------------
发现了 update user$ set spare6=decode(to_char(:2, ‘yyyy-mm-dd’), ‘0000-00-00’, to_date(null), :2) where user#=:1
故障处理
sql> @usid 6105
username sid audsid osuser machine program spid opid cpid lastcall status logon_time
---------- -------------- ----------- ---------------- --------- -------------- ------- ------ ---------- ---------- -------- -----------------
zylong '6105,18816' 31629560 administrator \pc2018 plsqldev.exe 13073 927 3604:5596 4003 inactive 20211101 08:42:32
sql> @kill sid=6105
commands_to_verify_and_run
--------------------------------------------------------------------------------------------------------------
alter system kill session '6105,18816' immediate -- zylong@workgroup\pc2018 (plsqldev.exe);
sql> alter system kill session '6105,18816' immediate -- zylong@workgroup\pc2018 (plsqldev.exe);
system altered.
通过监听日志找到这个主机的ip地址
[oracle@rac2 trace]$ cat listener.log |grep pc2018|head -10
01-nov-2021 08:39:58 * (connect_data=(service_name=orcl)(cid=(program=c:\program?files\plsql?developer\plsqldev.exe)
(host=pc2018)(user=administrator))(server=dedicated)(instance_name=cdbname2)) * (address=(protocol=tcp)(host=10.10.10.17)(port=50662)) * establish * orcl * 0
故障原因
客户端\pc2018更新了 last login time (update user$ set spare6=decode(to_char(:2, ‘yyyy-mm-dd’), ‘0000-00-00’, to_date(null), :2) where user#=:1) ,不知道啥原因没有提交,锁住基表user$,导致其他session登录时无法更新user$。
更详细的解释,请参考我们大神的博客
- oracle 12c new feature: last login time for non-sys users
[oracle@rac1 ~]$ sqlplus zylong/zylong@192.168.0.10:1521/orcl
sql*plus: release 19.0.0.0.0 - production on mon nov 1 11:42:25 2021
version 19.11.0.0.0
米乐app官网下载 copyright (c) 1982, 2020, oracle. all rights reserved.
last successful login time: mon nov 01 2021 11:42:25 08:00 <<<<# 记录在user$表的spare6列.
connected to:
oracle database 19c enterprise edition release 19.0.0.0.0 - production
version 19.11.0.0.0
sql>
# oracle不显示sys或操作系统认证的登录
sql> @desc dba_users
name null? type
------------------------------- -------- ----------------------------
1 username not null varchar2(128)
2 user_id not null number
3 password varchar2(4000)
4 account_status not null varchar2(32)
5 lock_date date
6 expiry_date date
7 default_tablespace not null varchar2(30)
8 temporary_tablespace not null varchar2(30)
9 local_temp_tablespace varchar2(30)
10 created not null date
11 profile not null varchar2(128)
12 initial_rsrc_consumer_group varchar2(128)
13 external_name varchar2(4000)
14 password_versions varchar2(17)
15 editions_enabled varchar2(1)
16 authentication_type varchar2(8)
17 proxy_only_connect varchar2(1)
18 common varchar2(3)
19 last_login timestamp(9) with time zone
20 oracle_maintained varchar2(1)
21 inherited varchar2(3)
22 default_collation varchar2(100)
23 implicit varchar2(3)
24 all_shard varchar2(3)
25 password_change_date date
sql> select to_char(last_login,'yyyy-mm-dd hh24:mi:ss') from dba_users where username='zylong';
to_char(last_login,
-------------------
2021-11-01 14:20:13
如果在 login 时 hang 或 libaray cache lock ,sql 如下
update user$ set spare6=decode(to_char(:2, ‘yyyy-mm-dd’), ‘0000-00-00’, to_date(null), :2) where user#=:1
user$.spare6对应的就是last_login , oracle已经对12c和19c推出了bug 补丁用入引入禁用lslt, 需要安装patch 32164034 才会引入_disable_last_successful_login_time 。
-
oracle19c 建议的 one-off patch之disable last successful login time
-
对于登录相关的 last successful login time的library cache lock又出新参数