最近迁移中遇到一个free block 问题。模拟记录。
xtts迁移备份报错(测试环境数据):
rman-00571: ===========================================================
rman-00569: =============== error message stack follows ===============
rman-00571: ===========================================================
rman-03009: failure of backup command on ora_disk_1 channel at 11/24/2020 09:45:47
ora-19566: exceeded limit of 0 corrupt blocks for file /u01/app/oracle/oradata/orcl10/t1.dbf
之前确实有坏块,因为没有rman备份,无法直接修复。使用event 10231和dbms.repair跳过,迁移前也已经drop。这个网上教程很多,处理方式也有很多,但都没有后续。使用上述办法数据层次上已经正常,其实数据修复后这个块只是不在具体对象中,是一个free block,在下一次format前仍是坏块。
xtts调用rman,rman对空块不备份并不是直接跳过,也会检测。
!!!再次说明rman备份的重要性。有一个rman就不用这么大费周章。
1、 bbed制造一个空块替换 (慎用)
2、 allocate extent datafile方式覆盖 (推荐)
3、 resize或10g环境中的truncate或许有用,但都是有限制的,本来也不是针对这个问题的处理方式。成功率并不高
说的有点绕,模拟一下就完事了。
模拟坏块及没有rman备份的修复
1、新建一个表插入6行数据。通过 rowid查看测试数据的文件号(rel_fno)、块号( blockno ,)和行号(rowno)
select rowid,
dbms_rowid.rowid_relative_fno(rowid) rel_fno,
dbms_rowid.rowid_block_number(rowid) blockno,
dbms_rowid.rowid_row_number(rowid) rowno
from t11;
rowid rel_fno blockno rowno
------------------ ---------- ---------- ----------
aaampcaafaaaaaoaaa 5 14 0
aaampcaafaaaaaoaab 5 14 1
aaampcaafaaaaaoaac 5 14 2
aaampcaafaaaaaoaad 5 14 3
aaampcaafaaaaaoaae 5 14 4
aaampcaafaaaaaoaaf 5 14 5
2、使用bbed对file#=5,block#=14进行破坏
bbed网上教程也很多,简单记录下:
set dba 5,14
dump /v dba 5,14 offset 0
modify /x 12345678 dba 5,14 offset 0 --修改
sum dba 5,14 apply 提交
3、测试数据是否正常
alter system flush buffer_cache; -数据已经缓存到内存中,手工刷新,不然不会报错
查询提示坏块:
09:42:48 sql> select * from t11;
select * from t11
*
error at line 1:
ora-01578: oracle data block corrupted (file # 5, block # 14)
ora-01110: data file 5: '/u01/app/oracle/oradata/orcl10/t1.dbf'
4、修复坏块
修复坏块方法很多:
- event 10231
- dbms.repair
- truncate 重新导入
- drop 重建
- rebulid 索引重建
09:42:57 sql> truncate table t11;
table truncated.
09:43:04 sql> select * from t11;
no rows selected
确实可以了查询了,之前查行号显示6行数据,这里0行,没有备份的修复代价就是这部分数据丢失。再次提醒备份的重要性。
网上大部分的教程也都是这一部分,基本到这就结束了。这个工作真的彻底完成了吗?
5、dbv与rman检测
- dba检测
[oracle@rhel6 ~]$ dbv file='/u01/app/oracle/oradata/orcl10/t1.dbf'
dbverify: release 10.2.0.4.0 - production on tue nov 24 09:47:02 2020
米乐app官网下载 copyright (c) 1982, 2007, oracle. all rights reserved.
dbverify - verification starting : file = /u01/app/oracle/oradata/orcl10/t1.dbf
page 14 is marked corrupt
corrupt block relative dba: 0x0140000e (file 5, block 14)
bad header found during dbv:
data in bad block:
type: 18 format: 4 rdba: 0x0140000e
last change scn: 0x0000.000959ff seq: 0x8 flg: 0x06
spare1: 0x56 spare2: 0x78 spare3: 0x0
consistency value in tail: 0x59ff0608
check value in block header: 0x588c
computed block checksum: 0x0
dbverify - verification complete
total pages examined : 1280
total pages processed (data) : 4
total pages failing (data) : 0
total pages processed (index): 0
total pages failing (index): 0
total pages processed (other): 11
total pages processed (seg) : 0
total pages failing (seg) : 0
total pages empty : 1264
**total pages marked corrupt : 1**
total pages influx : 0
highest block scn : 613140 (0.613140)
- rman检测
backup validate check logical datafile 5;
--校验后查询
09:50:36 sql> select * from v$database_block_corruption;
file# block# blocks corruption_change# corruptio
---------- ---------- ---------- ------------------ ---------
5 14 1 0 corrupt
--还不死心,直接备份:
rman> backup datafile 5;
starting backup at 24-nov-20
using target database control file instead of recovery catalog
allocated channel: ora_disk_1
channel ora_disk_1: sid=145 devtype=disk
channel ora_disk_1: starting full datafile backupset
channel ora_disk_1: specifying datafile(s) in backupset
input datafile fno=00005 name=/u01/app/oracle/oradata/orcl10/t1.dbf
channel ora_disk_1: starting piece 1 at 24-nov-20
rman-00571: ===========================================================
rman-00569: =============== error message stack follows ===============
rman-00571: ===========================================================
rman-03009: failure of backup command on ora_disk_1 channel at 11/24/2020 09:45:47
ora-19566: exceeded limit of 0 corrupt blocks for file /u01/app/oracle/oradata/orcl10/t1.dbf
基本验证了上述理论。重建方式或手工标记只是解决了数据上的。坏块还是存在的,被置为free block。如果确认以后不再使用rman的情况下,也可以这么做。
1、根据以上dbv或rman信息确定坏块信息
- 文件号: file#=5
- 块 号: block#=14
- 文件名: /u01/app/oracle/oradata/orcl10/t1.dbf --t1表空间
2、检查块是否是某个对象的一部分
这个查询在大库中会很慢,尝试从v$bh中抓取失败,等一会。
select segment_name, segment_type, owner
from dba_extents
where file_id = 5
and 14 between block_id
and block_id blocks -1;
no rows selected --dba_extents已经查不到了,不属于任何对象
3、如果块不属于任何对象,查询dba_free_space确认块是否属于数据文件的可用空间
select * from dba_free_space where file_id= 5
and 14 between block_id and block_id blocks -1;
tablespace_name file_id block_id bytes blocks relative_fno
------------------------------ ---------- ---------- ---------- ---------- ------------
t1 5 9 65536 8 5
运行以下sql,确认块是位于可用空间中还是已占用空间中
这sql也很慢,模拟的时候可以执行找到原因,方便理解。生产修复可以略过。
set lines 200 pages 10000
col segment_name format a30
select e.owner, e.segment_type, e.segment_name, e.partition_name, c.file#
, greatest(e.block_id, c.block#) corr_start_block#
, least(e.block_id e.blocks-1, c.block# c.blocks-1) corr_end_block#
, least(e.block_id e.blocks-1, c.block# c.blocks-1)
- greatest(e.block_id, c.block#) 1 blocks_corrupted
, null description
from dba_extents e, v$database_block_corruption c
where e.file_id = c.file#
and e.block_id <= c.block# c.blocks - 1
and e.block_id e.blocks - 1 >= c.block#
union
select s.owner, s.segment_type, s.segment_name, s.partition_name, c.file#
, header_block corr_start_block#
, header_block corr_end_block#
, 1 blocks_corrupted
, 'segment header' description
from dba_segments s, v$database_block_corruption c
where s.header_file = c.file#
and s.header_block between c.block# and c.block# c.blocks - 1
union
select null owner, null segment_type, null segment_name, null partition_name, c.file#
, greatest(f.block_id, c.block#) corr_start_block#
, least(f.block_id f.blocks-1, c.block# c.blocks-1) corr_end_block#
, least(f.block_id f.blocks-1, c.block# c.blocks-1)
- greatest(f.block_id, c.block#) 1 blocks_corrupted
, 'free block' description
from dba_free_space f, v$database_block_corruption c
where f.file_id = c.file#
and f.block_id <= c.block# c.blocks - 1
and f.block_id f.blocks - 1 >= c.block#
order by file#, corr_start_block#;
--最后列,free block
owner segment_type segment_name partition_name file# corr_start_block# corr_end_block# blocks_corrupted description
------------------------------ ------------------ ------------------------------ ------------------------------ ---------- ----------------- --------------- ---------------- --------------
5 14 14 1 free block
4、创建中间表
创建一个表 ,该表位于出现坏块的表空间t1中,使用nologging选项,避免生成redo
conn t1/t1
create table s (n number,c varchar2(4000)) nologging;
--验证表是否创建在了正确的表空间
select segment_name,tablespace_name from user_segments where segment_name='s' ;
segment_name tablespace_name
------------------------------ ------------------------------
s t1
5、在表上创建触发器
根据dbv的结果输入变量,块号14 ,数据文件5
conn / as sysdba
create or replace trigger corrupt_trigger
after insert on t1.s
referencing old as p_old new as new_p
for each row
declare
corrupt exception;
begin
if (dbms_rowid.rowid_block_number(:new_p.rowid)=&blocknumber)
and (dbms_rowid.rowid_relative_fno(:new_p.rowid)=&filenumber) then
raise corrupt;
end if;
exception
when corrupt then
raise_application_error(-20000, 'corrupt block has been formatted');
end;
/
6、计算分配空间
select bytes/1024 from dba_free_space where file_id=5 and 1280 between block_id and block_id blocks -1;
bytes/1024
----------
64
为受影响的数据文件中的表分配空间,这里计算下来是64k。
7、 关闭datafile 自动扩展
先查询下,坏块修复在改回原值保持一致。
select file_name,autoextensible from dba_data_files where file_name='/u01/app/oracle/oradata/orcl10/t1.dbf';
alter database datafile '/u01/app/oracle/oradata/orcl10/t1.dbf' autoextend off;
8、分配空间
alter table t1.s allocate extent (datafile '/u01/app/oracle/oradata/orcl10/t1.dbf' size 64k);
如果在这个数据文件中有多个空闲extent,或者计算的分配空间很大,也可以使用这个循环:
begin
for i in 1..1000000 loop
execute immediate 'alter table t1.s allocate extent (datafile '||'''/u01/app/oracle/oradata/orcl10/t1.dbf''' ||'size 64k) ';
end loop;
end ;
/
使用这个循环也有一个好处,第9步中的查询会很慢(测试环境可能几秒钟跑完了,实际环境中跑了接近2小时),使用for 循环可以避免多次查询,由于datafile自动扩展关了,受限于datafile 32g并不会造成太多数据。
9、确认坏块成为新建表的一部分
select segment_name, segment_type, owner
from dba_extents
where file_id = 5
and 1280 between block_id
and block_id blocks -1 ;
segment_name segment_type owner
------------------------------ ------------------ ------------------------------
s table t1
10、向表中插入数据格式化块
begin
for i in 1..1000000000 loop
insert /* append */ into t1.s select i, lpad('reformat',3092, 'r') from dual;
commit ;
end loop;
end;
每向表中插入一行就会触发触发器,当向坏块中插入第一行数据的时候,会产生ora-2000异常 。
error at line 1:
ora-20000: corrupt block has been formatted
ora-06512: at “sys.corrupt_trigger”, line 10
ora-04088: error during execution of trigger ‘sys.corrupt_trigger’
ora-06512: at line 4
11、 重新验证
直接dbv不行,要先使用rman在用dbv就正常了
- rman
rman> backup validate check logical datafile 5;
starting backup at 24-nov-20
using target database control file instead of recovery catalog
allocated channel: ora_disk_1
channel ora_disk_1: sid=159 devtype=disk
channel ora_disk_1: starting full datafile backupset
channel ora_disk_1: specifying datafile(s) in backupset
input datafile fno=00005 name=/u01/app/oracle/oradata/orcl10/t1.dbf
channel ora_disk_1: backup set complete, elapsed time: 00:00:02
finished backup at 24-nov-20
sql> select * from v$database_block_corruption;
no rows selected
rman> backup datafile 5;
starting backup at 24-nov-20
using target database control file instead of recovery catalog
allocated channel: ora_disk_1
channel ora_disk_1: sid=145 devtype=disk
channel ora_disk_1: starting full datafile backupset
channel ora_disk_1: specifying datafile(s) in backupset
input datafile fno=00005 name=/u01/app/oracle/oradata/orcl10/t1.dbf
channel ora_disk_1: starting piece 1 at 24-nov-20
channel ora_disk_1: finished piece 1 at 24-nov-20
piece handle=/u01/app/oracle/product/10.2.0/db_1/dbs/05vganpf_1_1 tag=tag20201124t110415 comment=none
channel ora_disk_1: backup set complete, elapsed time: 00:00:01
finished backup at 24-nov-20
- dbv
[oracle@rhel6 ~]$ dbv file='/u01/app/oracle/oradata/orcl10/t1.dbf'
dbverify: release 10.2.0.4.0 - production on tue nov 24 11:03:53 2020
米乐app官网下载 copyright (c) 1982, 2007, oracle. all rights reserved.
dbverify - verification starting : file = /u01/app/oracle/oradata/orcl10/t1.dbf
dbverify - verification complete
total pages examined : 7680
total pages processed (data) : 1147
total pages failing (data) : 0
total pages processed (index): 0
total pages failing (index): 0
total pages processed (other): 134
total pages processed (seg) : 0
total pages failing (seg) : 0
total pages empty : 6399
total pages marked corrupt : 0
total pages influx : 0
highest block scn : 631762 (0.631762)
12、清理
drop table,切换日志和checkpoint后删除触发器
drop table scott.s;
alter system switch logfile;
alter system switch logfile;
alter system checkpoint;
drop trigger corrupt_trigger ;