最近遇到一个 case,在关闭数据库的时候报 ora-01210: data file header is media corrupt
sys@orcl 09:11:53> shutdown immediate
ora-01122: database file 5 failed verification check
ora-01110: data file 5: '/oradata/orcl/tbs01.dbf'
ora-01210: data file header is media corrupt
通过查看 alter 日志发现,报错的数据文件在两个数据库实例中都有创建tbs成功的语句中,类似如下:
completed: create tablespace tbs datafile '/oradata/orcl/tbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m;
可以看到,都没有使用 reuse 关键字。以下是在我本地环境模拟。
数据库的版本是 19.5,由于目前19.5ru需要密码才能下载,我在本地19.3 和19.17 环境分别模拟如下:
–19.3
在 hhhh 实例上创建 hhtbs01 表空间,数据文件是 /u01/hhdata/hhhh/datafile/hhtbs01.dbf
create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m;
hhhh:sys@(301_hhhh)>
hhhh:sys@(301_hhhh)> alter system switch logfile;
system altered.
hhhh:sys@(301_hhhh)> create table h (i int) tablespace hhtbs01;
table created.
hhhh:sys@(301_hhhh)> insert into h values(100);
1 row created.
hhhh:sys@(301_hhhh)> c/100/101
1* insert into h values(101)
hhhh:sys@(301_hhhh)> /
1 row created.
hhhh:sys@(301_hhhh)> commit;
commit complete.
hhhh:sys@(301_hhhh)> insert into h select * from h;
2 rows created.
hhhh:sys@(301_hhhh)> /
4 rows created.
hhhh:sys@(301_hhhh)> /
8 rows created.
hhhh:sys@(301_hhhh)> /
16 rows created.
hhhh:sys@(301_hhhh)>
并插入一些数据。
oradb01 instance 也创建 hhtbs01 表空间,数据文件是 /u01/hhdata/hhhh/datafile/hhtbs01.dbf
oradb01:sys@(209_cdb$root)> create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m;
create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m
*
error at line 1:
ora-01119: error in creating database file '/u01/hhdata/hhhh/datafile/hhtbs01.dbf'
ora-27038: created file already exists
additional information: 1
很显然由于没有reuse
关键字,无法创建成功的。但是根据 case 的alert 日志,很明显两个实例在不同的时间点,都执行创建表空间成功了,且都是同一个数据文件。
create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m
completed: create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m
这里推测应该是手工将之前的文件进行了rm,不然第二个实例是无法创建成功的,即使通过 touch 一个空的数据文件,create tbs 不使用reuse
关键字也是无法创建成功的。
oggscdg:sys@(963_cdb$root)> !touch /u01/hhdata/hhhh/datafile/htouch.dbf
oggscdg:sys@(963_cdb$root)> !ls -l /u01/hhdata/hhhh/datafile/htouch.dbf
-rw-r--r--. 1 oracle oinstall 0 dec 21 09:35 /u01/hhdata/hhhh/datafile/htouch.dbf
oggscdg:sys@(963_cdb$root)> create tablespace htouch datafile '/u01/hhdata/hhhh/datafile/htouch.dbf' size 100m autoextend on next 1000m maxsize 32767m;
oradb01:sys@(963_cdb$root)> create tablespace htouch datafile '/u01/hhdata/hhhh/datafile/htouch.dbf' size 100m autoextend on next 1000m maxsize 32767m;
create tablespace htouch datafile '/u01/hhdata/hhhh/datafile/htouch.dbf' size 100m autoextend on next 1000m maxsize 32767m
*
error at line 1:
ora-01119: error in creating database file '/u01/hhdata/hhhh/datafile/htouch.dbf'
ora-27038: created file already exists
additional information: 1
文件已经存在的情况下,只有使用 reuse
关键字才能创建成功
oradb01:sys@(963_cdb$root)> create tablespace htouch datafile '/u01/hhdata/hhhh/datafile/htouch.dbf' size 100m reuse autoextend on next 1000m maxsize 32767m;
tablespace created.
同时查看 alert 日志里清晰的记录 有reuse 关键字
completed: create tablespace htouch datafile '/u01/hhdata/hhhh/datafile/htouch.dbf' size 100m reuse autoextend on next 1000m maxsize 32767m
这里通过 rm 掉hhhh 实例的数据文件之后,在 oradb01 实例上,创建成功。
oradb01:sys@(209_cdb$root)> ! ls -l /u01/hhdata/hhhh/datafile/hhtbs01.dbf
-rw-r-----. 1 oracle oinstall 104865792 dec 21 08:27 /u01/hhdata/hhhh/datafile/hhtbs01.dbf
oradb01:sys@(209_cdb$root)> ! rm /u01/hhdata/hhhh/datafile/hhtbs01.dbf
oradb01:sys@(209_cdb$root)> ! ls -l /u01/hhdata/hhhh/datafile/hhtbs01.dbf
ls: cannot access /u01/hhdata/hhhh/datafile/hhtbs01.dbf: no such file or directory
oradb01:sys@(209_cdb$root)> create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m;
tablespace created.
oradb01:sys@(209_cdb$root)>
问题出现了,我的版本是 19.3 ,在过了大概 1分钟左右,还没进行 shutdown 测试呢,hhhh 实例 crash 了。
2022-12-21t08:27:03.855326 08:00
create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m
completed: create tablespace hhtbs01 datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' size 100m autoextend on next 1000m maxsize 32767m
2022-12-21t08:27:54.223384 08:00
thread 1 advanced to log sequence 13 (lgwr switch)
current log# 1 seq# 13 mem# 0: /u01/hhdata/hhhh/onlinelog/redo01.log
2022-12-21t08:27:54.227725 08:00
arc3 (pid:11362): archived log entry 8 added for t-1.s-12 id 0xd4895537 lad:1
2022-12-21t08:32:08.183283 08:00
read of datafile '/u01/hhdata/hhhh/datafile/hhtbs01.dbf' (fno 10) header failed with ora-01210
hex dump of (file 10, block 1) in trace file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc
corrupt block relative dba: 0x02800001 (file 10, block 1)
bad header found during datafile header read
data in bad block:
type: 11 format: 2 rdba: 0x04c00001
last change scn: 0x0000.0000.00000000 seq: 0x1 flg: 0x04
spare3: 0x0
consistency value in tail: 0x00000b01
check value in block header: 0xd9c0
computed block checksum: 0x0
rereading datafile 10 header failed with ora-01210
2022-12-21t08:32:08.265924 08:00
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc:
ora-63999: data file suffered media failure
ora-01122: database file 10 failed verification check
ora-01110: data file 10: '/u01/hhdata/hhhh/datafile/hhtbs01.dbf'
ora-01210: data file header is media corrupt
2022-12-21t08:32:08.266337 08:00
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc:
ora-63999: data file suffered media failure
ora-01122: database file 10 failed verification check
ora-01110: data file 10: '/u01/hhdata/hhhh/datafile/hhtbs01.dbf'
ora-01210: data file header is media corrupt
2022-12-21t08:32:08.299887 08:00
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ora-01110: data file 2: '/u01/hhdata/hhhh/datafile/htest01.dbf'
ora-01210: data file header is media corrupt
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_ckpt_10899.trc (incident=37033):
ora-63999 [] [] [] [] [] [] [] [] [] [] [] []
incident details in: /u01/app/oracle/diag/rdbms/hhhh/hhhh/incident/incdir_37033/hhhh_ckpt_10899_i37033.trc
2022-12-21t08:32:08.479859 08:00
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ora-01110: data file 5: '/u01/hhdata/hhhh/datafile/tbs01.dbf'
ora-01210: data file header is media corrupt
2022-12-21t08:32:08.626372 08:00
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ora-01110: data file 8: '/u01/hhdata/hhhh/datafile/oth01.dbf'
2022-12-21t08:32:08.783624 08:00
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ora-01110: data file 9: '/u01/hhdata/hhhh/datafile/coth01.dbf'
ora-01210: data file header is media corrupt
2022-12-21t08:32:08.927298 08:00
errors in file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_mz00_4206.trc:
ora-01110: data file 10: '/u01/hhdata/hhhh/datafile/hhtbs01.dbf'
ora-01210: data file header is media corrupt
2022-12-21t08:32:09.252599 08:00
user (ospid: ): terminating the instance due to ora error
2022-12-21t08:32:09.350800 08:00
system state dump requested by (instance=1, osid=10899 (ckpt)), summary=[abnormal instance termination].
system state dumped to trace file /u01/app/oracle/diag/rdbms/hhhh/hhhh/trace/hhhh_diag_10880.trc
2022-12-21t08:32:10.954050 08:00
dumping diagnostic data in directory=[cdmp_20221221083209], requested by (instance=1, osid=10899 (ckpt)), summary=[abnormal instance termination].
2022-12-21t08:32:12.176920 08:00
instance terminated by user, pid = 10899
这个库我测试了多次,alert 日志有多个报 ora-01210: data file header is media corrupt 的信息,可以忽略。但是 case 遇到情况是:库可以正常运行,关闭库的时候无法正常关闭提示 :ora-01210: data file header is media corrupt。
由于找不到 19.5 的 ru 包,索性就升级到最新的19.17 进行模拟测试。
19.17 进行以上相同步骤的测试,创建之后原库正常,可以读写,过了几分钟也没有crash,查看alert日志也都正常,于是我就进行一些插入操作。在插入了 2621440 条数据之后,同样报了 ora-01122和 ora-01110但是没有报ora-01210,这次报的是 ora-01203: wrong incarnation of this file - wrong creation scn
,库的状态仍然是 open,这个表空间上的表仍然可以读。
sys@orcl 20-dec-22> create table a( i int) tablespace tbs;
table created.
elapsed: 00:00:00.38
sys@orcl 20-dec-22> insert into a values(1);
1 row created.
elapsed: 00:00:00.01
sys@orcl 20-dec-22> c/1/2
1* insert into a values(2)
sys@orcl 20-dec-22> /
1 row created.
elapsed: 00:00:00.00
sys@orcl 20-dec-22> c/2/3
1* insert into a values(3)
sys@orcl 20-dec-22> /
1 row created.
elapsed: 00:00:00.00
sys@orcl 20-dec-22> commit;
commit complete.
elapsed: 00:00:00.00
sys@orcl 20-dec-22> alter system switch logfile;
sys@orcl 07:19:44> /
/
2621440 rows created.
elapsed: 00:00:00.54
sys@orcl 07:19:45>
/
insert into a select * from a
*
error at line 1:
ora-01122: database file 5 failed verification check
ora-01110: data file 5: '/oradata/orcl/tbs01.dbf'
ora-01203: wrong incarnation of this file - wrong creation scn
关闭数据库测试,出现如下错误信息:
sys@orcl 09:59:39> shutdown immediate
ora-01122: database file 5 failed verification check
ora-01110: data file 5: '/oradata/orcl/tbs01.dbf'
ora-01203: wrong incarnation of this file - wrong creation scn
sys@orcl 09:59:44>
这里基本可以肯定是通过 rm 掉数据文件,然后再另一个数据库实例进行了表空间的重建。这里可以总结下,19.3 的时候,数据库直接crash,都无法提供服务了。19.5 会报 ora-01210: data file header is media corrupt ,数据库仍然是open状态能够提供服务(case 环境)。到了19.17 之后,报错更详细了:ora-01203: wrong incarnation of this file - wrong creation scn 而不仅仅是报文件头被损坏。
当然通过rm 之后,数据库还没关闭的情况下,句柄是仍然存在的,通过 lsof 可以查看,可以看到 dbf文件仍然存在,理论可以恢复的,下次文章再测试下如何恢复。
[oracle@dsmart:/home/oracle]$ lsof |grep delete|grep dbf
ora_dbw0_ 10767 oracle 263uw reg 253,1 104865792 917524 /oradata/orcl/tbs01.dbf (deleted)
ora_ckpt_ 10775 oracle 261u reg 253,1 104865792 917524 /oradata/orcl/tbs01.dbf (deleted)
[oracle@dsmart:/home/oracle]$