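I recently had a request at work. One of our NetApp arrays has been out of warranty for some time and now serves as a test environment. It recently developed a fault: one link is in an abnormal state, so two of the multipath paths configured on it are no longer active, and the systems team was worried this would affect the databases. A 3PAR array that had also just gone out of warranty happened to be available. It beats the NetApp array in both configuration and performance, so I was asked to help swap the storage.
The database team runs four environments on this NetApp array. Three of them are single-instance databases and easy to move. The fourth is an Oracle 19c RAC test database. It is only a test database, but many tests for the current production systems are validated on it, so it is very important: the project team has to be told in advance, and data loss or corruption must be avoided.
I had previously tested replacing EMC storage with 3PAR storage. EMC normally uses EMC's own multipath software, while 3PAR usually uses the multipath software that ships with the operating system, and installing both on the same server leads to compatibility problems.
The NetApp array here, however, also uses multipath, which is compatible with 3PAR.
To guard against surprises during the swap, I still took backups in advance: I manually backed up the multipath information and the OCR, and backed up the database data.
I also notified the project team and scheduled the work for after hours.
This 19c RAC database runs on CentOS 7.9 and currently uses a little over 3 TB of storage.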
[root@local-test-xxxx~]# cat /etc/redhat-release
centos linux release 7.9.2009 (core)
[root@local-test-mlandb1 ~]# su - grid
last login: mon aug 28 13:29:37 cst 2023
[grid@local-test-xxxx~]$ ./asm/asm
curdate
-------------------
2023-08-28 13:44:07
disk group sector block allocation
name size size unit size state type total size (mb) used size (mb) free_mb pct. used
---------------- ------- ------- ------------ ----------- ------ --------------- -------------- ---------- ---------
crs 512 4,096 4,194,304 mounted normal 30,672 1,132 29540 3.69
dg 512 4,096 4,194,304 mounted extern 3,583,968 3,188,400 395568 88.96
--------------- --------------
grand total: 3,614,640 3,189,532
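As a quick sanity check on the pct. used column, the percentage can be recomputed from the total and used sizes in the listing above. This is only a sketch; `pct_used` is a helper name made up for illustration and is not part of any Oracle tool.

```shell
# pct_used TOTAL_MB USED_MB
# Hypothetical helper: print used space as a percentage with two decimals.
pct_used() {
    awk -v total="$1" -v used="$2" 'BEGIN { printf "%.2f\n", used * 100 / total }'
}

pct_used 30672 1132          # crs disk group
pct_used 3583968 3188400     # dg disk group
```

Both results agree with the listing (3.69 and 88.96).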
The ASM disks were created with the Oracle ASMLib package.
[root@local-test-xxxx~]# oracleasm -v
oracleasm version 2.1.11
[root@local-test-xxxx~]# oracleasm listdisks
vol01
vol02
vol03
vol04
vol05
vol06
vol07
The OCR information is as follows:
[grid@local-test-xxxx ~]$ ocrcheck
status of oracle cluster registry is as follows :
version : 4
total space (kbytes) : 491684
used space (kbytes) : 84332
available space (kbytes) : 407352
id : 231456364
device/file name : crs
device/file integrity check succeeded
device/file not configured
device/file not configured
device/file not configured
device/file not configured
cluster registry integrity check succeeded
logical corruption check bypassed due to non-privileged user
sql> select group_number,name,type,state,total_mb/1024 total_gb,free_mb/1024 free_gb,round((free_mb/total_mb)*100,2)||'%' pct_free from v$asm_diskgroup;
1 crs normal mounted 23.9882813 22.9648438 95.73%
2 dg extern mounted 7000.41016 3983.82813 56.91%
The ASM disk information is as follows:
sql> select group_number,free_mb,total_mb,failgroup,disk_number,mount_status,mode_status,state,header_status,name,path from v$asm_disk order by 4,5;
1 6844 10224 crs_0001 3 cached online normal member crs_0001 /dev/oracleasm/disks/vol01
1 6848 10224 crs_0002 4 cached online normal member crs_0002 /dev/oracleasm/disks/vol02
1 6848 10224 crs_0003 5 cached online normal member crs_0003 /dev/oracleasm/disks/vol03
2 291368 512076 dg_00004 0 cached online normal member dg_0000 /dev/oracleasm/disks/vol04
2 291368 512076 dg_00005 1 cached online normal member dg_0001 /dev/oracleasm/disks/vol05
2 874144 1536192 dg_00006 3 cached online normal member dg_0003 /dev/oracleasm/disks/vol06
2 582744 1024108 dg_00007 2 cached online normal member dg_00007 /dev/oracleasm/disks/vol07
7 rows selected.
**Multipath information backup:**
[root@local-test-xxxx~]# multipath -ll > /root/multipath_xxx.txt
Database backup: the database was exported with expdp.
OCR backup:
-- as the root user
[root@local-test-mlandbxxx ~]# /u01/app/19.3.0/grid/bin/ocrconfig -export /tmp/ocr_20230808.dmp
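I then asked the system administrator to allocate 3PAR storage to this RAC. To make the new storage easy to tell apart from the original, each newly allocated LUN has a different size from the old ones. Once the storage was allocated, the two database servers were rebooted one at a time so that they would recognize it.
Next, dd if=/dev/zero of=/dev/mapper/xxx bs=1024k count=10000 was run to zero the beginning of each new device, and each device was then partitioned with fdisk.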
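For reference, bs=1024k count=10000 makes dd write 10000 blocks of 1 MiB, so roughly the first 9.8 GiB of each device is overwritten with zeros. The sketch below wraps the same dd call in a small function and tries it on a scratch file instead of a real device. `zero_head` and the scratch file are illustrative names, not part of the original procedure.

```shell
# zero_head TARGET MIB
# Hypothetical helper: overwrite the first MIB mebibytes of TARGET with zeros.
# In practice TARGET would be a /dev/mapper/... device; double-check the path,
# because dd does not ask for confirmation.
zero_head() {
    target=$1
    mib=$2
    [ -e "$target" ] || { echo "no such target: $target" >&2; return 1; }
    dd if=/dev/zero of="$target" bs=1048576 count="$mib" 2>/dev/null
}

# Trial run against a scratch file rather than a multipath device.
scratch=$(mktemp)
zero_head "$scratch" 2
wc -c < "$scratch"      # size in bytes of what was written
rm -f "$scratch"
```

Refusing to run when the target path does not exist is a small guard against typos in the device name; it does not protect against pointing at the wrong existing device.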
The new ASM disks were then created with the oracleasm commands below, using names that differ from those of the original disks.
[root@local-test-xxxx~]# oracleasm createdisk vol1 /dev/mapper/mpathi1
[root@local-test-xxxx~]# oracleasm createdisk vol2 /dev/mapper/mpathj1
[root@local-test-xxxx~]# oracleasm createdisk vol3 /dev/mapper/mpathk1
[root@local-test-xxxx~]# oracleasm createdisk vol4 /dev/mapper/mpathl1
[root@local-test-xxxx~]# oracleasm createdisk vol5 /dev/mapper/mpathm1
--- Then scan for disks on the second node so that it picks up the new ASM disks.
[root@local-test-xxxx2 ~]# oracleasm scandisks
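Because the new labels (vol1 to vol5) sit so close to the existing ones (vol01 to vol07), it can help to check for clashes and print the createdisk commands before running anything. A minimal sketch, assuming the label-to-device pairs from the commands above; `find_collisions` and `plan_createdisk` are hypothetical helper names, and nothing here calls oracleasm itself.

```shell
# Existing ASMLib labels (from oracleasm listdisks) and planned label:device pairs.
EXISTING="vol01 vol02 vol03 vol04 vol05 vol06 vol07"
PLANNED="vol1:/dev/mapper/mpathi1
vol2:/dev/mapper/mpathj1
vol3:/dev/mapper/mpathk1
vol4:/dev/mapper/mpathl1
vol5:/dev/mapper/mpathm1"

# find_collisions "EXISTING LABELS" "NEW LABELS": print new labels already in use.
find_collisions() {
    for new in $2; do
        for old in $1; do
            if [ "$new" = "$old" ]; then printf '%s\n' "$new"; fi
        done
    done
}

# plan_createdisk: read label:device lines on stdin, print the commands to run.
plan_createdisk() {
    while IFS=: read -r label dev; do
        [ -n "$label" ] || continue
        printf 'oracleasm createdisk %s %s\n' "$label" "$dev"
    done
}

new_labels=$(printf '%s\n' "$PLANNED" | cut -d: -f1)
clashes=$(find_collisions "$EXISTING" "$new_labels")
if [ -n "$clashes" ]; then
    echo "label already in use: $clashes" >&2
else
    printf '%s\n' "$PLANNED" | plan_createdisk
fi
```

The printed commands can be reviewed and then run as root on the first node, followed by oracleasm scandisks on the second.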
I have replaced data disks many times before and was confident about this part. The new ASM disks are added to the data disk group as follows:
su - grid
sqlplus / as sysasm
sql> alter diskgroup dg add disk '/dev/oracleasm/disks/vol4' rebalance power 11;
sql> alter diskgroup dg add disk '/dev/oracleasm/disks/vol5' rebalance power 11;
sql> select * from v$asm_operation;
2 rebal compact wait 11 11 0 0 0 0 0
2 rebal rebalance run 11 11 5962 71713 3900 16 0
2 rebal rebuild done 11 11 0 0 0 0
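The run row above can be turned into a rough progress figure. The sketch below assumes the four numbers after the power columns are sofar, est_work, est_rate and est_minutes in that order (allocation units moved, allocation units to move, allocation units per minute, minutes remaining), as v$asm_operation documents them; `rebal_progress` is a hypothetical helper name.

```shell
# rebal_progress SOFAR EST_WORK EST_RATE
# Hypothetical helper: print percentage done and whole minutes remaining.
rebal_progress() {
    awk -v sofar="$1" -v work="$2" -v rate="$3" 'BEGIN {
        # Rows in wait or done state carry zeros, so there is nothing to estimate.
        if (work <= 0 || rate <= 0) { print "no estimate yet"; exit }
        printf "%.2f%% done, about %d min left\n",
               sofar * 100 / work, int((work - sofar) / rate)
    }'
}

rebal_progress 5962 71713 3900    # values from the rebalance/run row
```

For this row the remaining-time figure comes out at 16 minutes, the same as the est_minutes value in the query output.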
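Next came the OCR disks. Replacing the OCR disks on 10g RAC used to be tedious and required many configuration changes, but this was my first time replacing them on 19c and I was not confident. I looked for material online and found that JiekeXu had written about OCR replacement on 11g (https://cloud.tencent.com/developer/article/1696479). I was not sure whether the same approach would work on 19c, so I asked him.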
sql> alter diskgroup crs add disk '/dev/oracleasm/disks/vol1';
alter diskgroup crs add disk '/dev/oracleasm/disks/vol1'
*
error at line 1:
ora-15032: not all alterations performed
ora-15410: disks in disk group crs do not have equal size.
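Because the new OCR disks differ in size from the originals, they cannot simply be added. 彗星性 from Enmo (恩墨) pointed me to an article he had written, https://www.modb.pro/db/543611, which explains ORA-15410 (disks in disk group ocr_vot do not have equal size) and how to handle it. I did not try that approach this time.
Instead I decided to use the method JiekeXu described: add all the new OCR disks and drop the original OCR disks in a single statement.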
su - grid
sqlplus / as sysasm
sql> alter diskgroup crs add disk '/dev/oracleasm/disks/vol1','/dev/oracleasm/disks/vol2','/dev/oracleasm/disks/vol3' drop disk 'crs_0000','crs_0001','crs_0002';
diskgroup altered.
sql> select group_number,free_mb,total_mb,failgroup,disk_number,mount_status,mode_status,state,header_status,name,path from v$asm_disk order by 4,5;
1 9844 10224 crs_0003 3 cached online normal member crs_0003 /dev/oracleasm/disks/vol1
1 9848 10224 crs_0004 4 cached online normal member crs_0004 /dev/oracleasm/disks/vol2
1 9848 10224 crs_0005 5 cached online normal member crs_0005 /dev/oracleasm/disks/vol3
2 291368 512076 dg_0000 0 cached online normal member dg_0000 /dev/oracleasm/disks/vol04
2 291368 512076 dg_0001 1 cached online normal member dg_0001 /dev/oracleasm/disks/vol05
2 874144 1536192 dg_0003 3 cached online normal member dg_0003 /dev/oracleasm/disks/vol07
2 1107116 1945584 dg_0004 4 cached online normal member dg_0004 /dev/oracleasm/disks/vol4
2 932304 1638384 dg_0005 5 cached online normal member dg_0005 /dev/oracleasm/disks/vol5
2 582744 1024108 vol06 2 cached online normal member vol06 /dev/oracleasm/disks/vol06
0 0 0 0 closed online normal former /dev/oracleasm/disks/vol02
0 0 0 1 closed online normal former /dev/oracleasm/disks/vol01
0 0 0 2 closed online normal former /dev/oracleasm/disks/vol03
12 rows selected.
All three original OCR disks have indeed been replaced.
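In the listing above, the old OCR disks are the rows whose header_status is former, which ASM reports for a disk that was once in a disk group and has been dropped cleanly. A small sketch for pulling those paths out of saved query output, assuming the path is the last column as in the query used here; `former_disks` is a hypothetical helper name.

```shell
# former_disks: read v$asm_disk rows on stdin and print the path (last column)
# of every row that has "former" in one of the earlier columns.
former_disks() {
    awk '{ for (i = 1; i < NF; i++) if (tolower($i) == "former") { print $NF; break } }'
}

former_disks <<'EOF'
1 9844 10224 crs_0003 3 cached online normal member crs_0003 /dev/oracleasm/disks/vol1
0 0 0 0 closed online normal former /dev/oracleasm/disks/vol02
0 0 0 1 closed online normal former /dev/oracleasm/disks/vol01
0 0 0 2 closed online normal former /dev/oracleasm/disks/vol03
EOF
```

Only the three vol01 to vol03 paths are printed; the member row is skipped.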
4.1 Log output from replacing the OCR disks
Below is the ASM log output captured while the OCR disks were being replaced:
2023-08-08t16:00:10.337765 08:00
sql> alter diskgroup crs add disk '/dev/oracleasm/disks/vol1','/dev/oracleasm/disks/vol2','/dev/oracleasm/disks/vol3' drop disk 'crs_0000','crs_0001','crs_0002'
2023-08-08t16:00:10.339009 08:00
note: groupblock outside rolling migration privileged region
note: assigning number (1,3) to disk (/dev/oracleasm/disks/vol1)
note: assigning number (1,4) to disk (/dev/oracleasm/disks/vol2)
note: assigning number (1,5) to disk (/dev/oracleasm/disks/vol3)
note: requesting all-instance membership refresh for group=1
2023-08-08t16:00:11.457532 08:00
note: attempting voting file relocation on diskgroup crs
warning: read failed. group:1 disk:3 au:0 offset:0 size:4096
path:unknown disk
incarnation:0xf0f0ed7e asynchronous result:'i/o error' ioreason:2818 why:11
subsys:unknown library krq:0x7f5e6fbe5938 bufp:0x7f5e75183000 osderr1:0x434c5344 osderr2:0x0
io elapsed time: 0 usec time waited on i/o: 0 usec
warning: read failed. group:1 disk:4 au:0 offset:0 size:4096
path:unknown disk
incarnation:0xf0f0ed7f asynchronous result:'i/o error' ioreason:2818 why:11
subsys:unknown library krq:0x7f5e6fbe5490 bufp:0x7f5e75181000 osderr1:0x434c5344 osderr2:0x0
io elapsed time: 0 usec time waited on i/o: 0 usec
warning: read failed. group:1 disk:5 au:0 offset:0 size:4096
path:unknown disk
incarnation:0xf0f0ed80 asynchronous result:'i/o error' ioreason:2818 why:11
subsys:unknown library krq:0x7f5e751809e0 bufp:0x7f5e6fc13000 osderr1:0x434c5344 osderr2:0x0
io elapsed time: 0 usec time waited on i/o: 0 usec
note: successful voting file relocation on diskgroup crs
2023-08-08t16:00:12.534839 08:00
note: disk 3 in group 1 is assigned fgnum=4
note: disk 4 in group 1 is assigned fgnum=5
note: disk 5 in group 1 is assigned fgnum=6
note: discarding redo for group 1 disk 3
note: discarding redo for group 1 disk 4
note: discarding redo for group 1 disk 5
note: initializing header (replicated) on grp 1 disk crs_0003
note: initializing header (replicated) on grp 1 disk crs_0004
note: initializing header (replicated) on grp 1 disk crs_0005
note: initializing header on grp 1 disk crs_0003
note: initializing header on grp 1 disk crs_0004
note: initializing header on grp 1 disk crs_0005
note: requesting all-instance disk validation for group=1
2023-08-08t16:00:12.605238 08:00
note: skipping rediscovery for group 1/0xd4a01dae (crs) on local instance.
2023-08-08t16:00:12.727464 08:00
note: running client discovery for group 1 (reqid:15321278565476053588)
note: requesting all-instance disk validation for group=1
2023-08-08t16:00:12.731123 08:00
note: skipping rediscovery for group 1/0xd4a01dae (crs) on local instance.
2023-08-08t16:00:12.829109 08:00
note: running client discovery for group 1 (reqid:15321278565476065544)
note: allocated 1 dd reserve extents for group 1 (crs)
note: allocated 6 vat reserve extents for group 1 (crs).
note: adding disk 3 (crs_0003) to grp 1 (crs) (2556 aus)
2023-08-08t16:00:13.804894 08:00
note: adding disk 4 (crs_0004) to grp 1 (crs) (2556 aus)
2023-08-08t16:00:14.387378 08:00
note: adding disk 5 (crs_0005) to grp 1 (crs) (2556 aus)
2023-08-08t16:00:14.992385 08:00
gmon updating for reconfiguration, group 1 at 21 for pid 34, osid 30275
2023-08-08t16:00:15.003705 08:00
note: group 1 pst updated.
2023-08-08t16:00:15.068630 08:00
note: membership refresh pending for group 1/0xd4a01dae (crs)
note: attempting voting file refresh on diskgroup crs
note: refresh completed on diskgroup crs. found 3 voting file(s).
note: voting file relocation is required in diskgroup crs
2023-08-08t16:00:15.129060 08:00
gmon querying group 1 at 22 for pid 27, osid 6491
note: cache opening disk 3 of grp 1: crs_0003 path:/dev/oracleasm/disks/vol1
note: cache opening disk 4 of grp 1: crs_0004 path:/dev/oracleasm/disks/vol2
note: cache opening disk 5 of grp 1: crs_0005 path:/dev/oracleasm/disks/vol3
2023-08-08t16:00:15.552478 08:00
note: attempting voting file refresh on diskgroup crs
note: refresh completed on diskgroup crs. found 3 voting file(s).
note: voting file relocation is required in diskgroup crs
note: attempting voting file relocation on diskgroup crs
note: successful voting file relocation on diskgroup crs
2023-08-08t16:00:15.616116 08:00
gmon querying group 1 at 23 for pid 27, osid 6491
2023-08-08t16:00:15.657735 08:00
success: refreshed membership for 1/0xd4a01dae (crs)
2023-08-08t16:00:15.658273 08:00
success: alter diskgroup crs add disk '/dev/oracleasm/disks/vol1','/dev/oracleasm/disks/vol2','/dev/oracleasm/disks/vol3' drop disk 'crs_0000','crs_0001','crs_0002'
2023-08-08t16:00:15.659229 08:00
note: starting rebalance of group 1/0xd4a01dae (crs) at power 1
note: starting process arba
starting background process arba
2023-08-08t16:00:15.706992 08:00
arba started with pid=46, os id=48198
note: starting process arb0
starting background process arb0
2023-08-08t16:00:15.729429 08:00
arb0 started with pid=47, os id=48200
note: assigning arba to group 1/0xd4a01dae (crs) to compute estimates
note: assigning arb0 to group 1/0xd4a01dae (crs) with 1 parallel i/o
2023-08-08t16:00:16.056816 08:00
note: header on disk 0 advanced to format #2 using fcn 0.0
note: f1x0 on disk 0 (fmt 2) relocated at fcn 0.149388: au 10 -> au 0
note: header on disk 1 advanced to format #2 using fcn 0.0
note: f1x0 on disk 1 (fmt 2) relocated at fcn 0.149388: au 10 -> au 0
note: header on disk 2 advanced to format #2 using fcn 0.0
note: f1x0 on disk 2 (fmt 2) relocated at fcn 0.149388: au 10 -> au 0
note: header on disk 3 advanced to format #2 using fcn 0.0
note: f1x0 on disk 3 (fmt 2) relocated at fcn 0.149388: au 0 -> au 10
note: header on disk 4 advanced to format #2 using fcn 0.0
note: f1x0 on disk 4 (fmt 2) relocated at fcn 0.149388: au 0 -> au 10
note: header on disk 5 advanced to format #2 using fcn 0.0
note: f1x0 on disk 5 (fmt 2) relocated at fcn 0.149388: au 0 -> au 10
note: 08/08/23 16:00:15 crs.f1x0 copy 1 relocating from 0:10 to 4:10 at fcn 0.149388
note: 08/08/23 16:00:15 crs.f1x0 copy 2 relocating from 1:10 to 3:10 at fcn 0.149388
note: 08/08/23 16:00:15 crs.f1x0 copy 3 relocating from 2:10 to 5:10 at fcn 0.149388
2023-08-08t16:00:18.760349 08:00
note: attempting voting file refresh on diskgroup crs
note: refresh completed on diskgroup crs. found 3 voting file(s).
note: voting file relocation is required in diskgroup crs
note: attempting voting file relocation on diskgroup crs
note: voting file allocation (replicated) on grp 1 disk crs_0003
note: voting file allocation on grp 1 disk crs_0003
note: voting file allocation (replicated) on grp 1 disk crs_0004
note: voting file allocation on grp 1 disk crs_0004
note: voting file allocation (replicated) on grp 1 disk crs_0005
note: voting file allocation on grp 1 disk crs_0005
note: voting file deletion (replicated) on grp 1 disk crs_0000
note: voting file deletion on grp 1 disk crs_0000
note: voting file deletion (replicated) on grp 1 disk crs_0001
note: voting file deletion on grp 1 disk crs_0001
note: voting file deletion (replicated) on grp 1 disk crs_0002
note: voting file deletion on grp 1 disk crs_0002
note: successful voting file relocation on diskgroup crs
2023-08-08t16:00:21.170262 08:00
note: stopping process arb0
note: stopping process arba
note: starting expel slave for group 1/0xd4a01dae (crs)
2023-08-08t16:00:21.201736 08:00
note: groupblock outside rolling migration privileged region
note: requesting all-instance membership refresh for group=1
2023-08-08t16:00:21.304682 08:00
gmon updating for reconfiguration, group 1 at 24 for pid 49, osid 48289
2023-08-08t16:00:21.308250 08:00
note: group 1 pst updated.
success: grp 1 disk crs_0000 emptied
success: grp 1 disk crs_0001 emptied
success: grp 1 disk crs_0002 emptied
note: process _x000_ asm1 (48289) initiating offline of disk 0.4042321254 (crs_0000) with mask 0x7e in group 1 (crs) without client assisting
note: process _x000_ asm1 (48289) initiating offline of disk 1.4042321255 (crs_0001) with mask 0x7e in group 1 (crs) without client assisting
note: process _x000_ asm1 (48289) initiating offline of disk 2.4042321253 (crs_0002) with mask 0x7e in group 1 (crs) without client assisting
note: initiating pst update: grp 1 (crs), dsk = 0/0xf0f0ed66, mask = 0x6a, op = clear mandatory
note: initiating pst update: grp 1 (crs), dsk = 1/0xf0f0ed67, mask = 0x6a, op = clear mandatory
note: initiating pst update: grp 1 (crs), dsk = 2/0xf0f0ed65, mask = 0x6a, op = clear mandatory
2023-08-08t16:00:21.335368 08:00
gmon updating disk modes for group 1 at 25 for pid 49, osid 48289
note: group crs: updated pst location: disks 0003 0004 0005
2023-08-08t16:00:21.342612 08:00
note: pst update grp = 1 completed successfully
note: initiating pst update: grp 1 (crs), dsk = 0/0xf0f0ed66, mask = 0x7e, op = clear mandatory
note: initiating pst update: grp 1 (crs), dsk = 1/0xf0f0ed67, mask = 0x7e, op = clear mandatory
note: initiating pst update: grp 1 (crs), dsk = 2/0xf0f0ed65, mask = 0x7e, op = clear mandatory
2023-08-08t16:00:21.343143 08:00
gmon updating disk modes for group 1 at 26 for pid 49, osid 48289
2023-08-08t16:00:21.345868 08:00
note: cache closing disk 0 of grp 1: crs_0000
2023-08-08t16:00:21.349547 08:00
note: cache closing disk 1 of grp 1: crs_0001
2023-08-08t16:00:21.349885 08:00
note: cache closing disk 2 of grp 1: crs_0002
2023-08-08t16:00:21.351878 08:00
note: pst update grp = 1 completed successfully
note: expelling disk 0 (crs_0000) from grp 1 (crs)
note: expelling disk 1 (crs_0001) from grp 1 (crs)
note: expelling disk 2 (crs_0002) from grp 1 (crs)
2023-08-08t16:00:21.354440 08:00
gmon updating for reconfiguration, group 1 at 27 for pid 49, osid 48289
2023-08-08t16:00:21.354951 08:00
note: cache closing disk 0 of grp 1: (not open) crs_0000
2023-08-08t16:00:21.355051 08:00
note: cache closing disk 1 of grp 1: (not open) crs_0001
2023-08-08t16:00:21.355145 08:00
note: cache closing disk 2 of grp 1: (not open) crs_0002
2023-08-08t16:00:21.357594 08:00
note: group 1 pst updated.
note: grp 1 disk 0 expelled from the pst.
note: grp 1 disk 1 expelled from the pst.
note: grp 1 disk 2 expelled from the pst.
note: erasing header on /dev/oracleasm/disks/vol01
note: erasing header on /dev/oracleasm/disks/vol02
note: erasing header on /dev/oracleasm/disks/vol03
2023-08-08t16:00:21.625721 08:00
note: membership refresh pending for group 1/0xd4a01dae (crs)
note: attempting voting file refresh on diskgroup crs
note: refresh completed on diskgroup crs. found 3 voting file(s).
note: voting file relocation is required in diskgroup crs
2023-08-08t16:00:21.687195 08:00
gmon querying group 1 at 28 for pid 27, osid 6491
gmon querying group 1 at 29 for pid 27, osid 6491
2023-08-08t16:00:21.708706 08:00
note: disk crs_0000 in mode 0x0 marked for de-assignment
note: disk crs_0001 in mode 0x0 marked for de-assignment
note: disk crs_0002 in mode 0x0 marked for de-assignment
success: refreshed membership for 1/0xd4a01dae (crs)
2023-08-08t16:00:22.580833 08:00
success: rebalance completed for group 1/0xd4a01dae (crs)
note: attempting voting file refresh on diskgroup crs
note: refresh completed on diskgroup crs. found 3 voting file(s).
note: voting file relocation is required in diskgroup crs
note: attempting voting file relocation on diskgroup crs
note: successful voting file relocation on diskgroup crs
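The two lines that matter most in a log like this are the voting file relocation and the rebalance completion. A minimal sketch that looks for both in a saved copy of the log; `ocr_swap_ok` is a hypothetical helper name, the log path is whatever file the excerpt was saved to, and the match is case-insensitive because real alert logs write the prefixes in upper case.

```shell
# ocr_swap_ok LOGFILE DISKGROUP
# Hypothetical helper: succeed only if the log shows both a successful voting
# file relocation and a completed rebalance for the given disk group.
ocr_swap_ok() {
    log=$1
    dg=$2
    grep -qi "successful voting file relocation on diskgroup $dg" "$log" &&
    grep -Eqi "success: rebalance completed for group [0-9]+/0x[0-9a-f]+ \($dg\)" "$log"
}

# Trial run on two lines copied from the log above.
sample=$(mktemp)
printf '%s\n' \
    'success: rebalance completed for group 1/0xd4a01dae (crs)' \
    'note: successful voting file relocation on diskgroup crs' > "$sample"
if ocr_swap_ok "$sample" crs; then
    echo "crs: voting files relocated and rebalance completed"
fi
rm -f "$sample"
```

This only confirms that the markers are present; it does not replace checking ocrcheck and v$asm_disk as done earlier.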
4.2 Log output from dropping the data disks
2023-08-08t16:15:21.792257 08:00
note: requesting all-instance membership refresh for group=2
2023-08-08t16:15:21.820517 08:00
note: membership refresh pending for group 2/0xd4c01daf (dg)
2023-08-08t16:15:21.822852 08:00
gmon querying group 2 at 30 for pid 27, osid 6491
2023-08-08t16:15:21.844719 08:00
success: refreshed membership for 2/0xd4c01daf (dg)
2023-08-08t16:15:21.845335 08:00
success: alter diskgroup dg drop disk dg_0000 rebalance power 11 nowait
2023-08-08t16:15:23.305285 08:00
note: attempting voting file refresh on diskgroup dg
note: refresh completed on diskgroup dg. no voting file found.
2023-08-08t16:15:23.328193 08:00
note: starting rebalance of group 2/0xd4c01daf (dg) at power 11
note: starting process arba
starting background process arba
2023-08-08t16:15:23.377695 08:00
arba started with pid=39, os id=10549
note: starting process arb0
starting background process arb0
2023-08-08t16:15:23.403147 08:00
arb0 started with pid=47, os id=10551
note: assigning arba to group 2/0xd4c01daf (dg) to compute estimates
note: assigning arb0 to group 2/0xd4c01daf (dg) with 11 parallel i/os
2023-08-08t16:15:23.603553 08:00
note: f1x0 on disk 0 (fmt 2) relocated at fcn 0.14556462: au 10 -> au 0
note: header on disk 4 advanced to format #2 using fcn 0.0
note: f1x0 on disk 4 (fmt 2) relocated at fcn 0.14556462: au 0 -> au 209615
note: 08/08/23 16:15:23 dg.f1x0 copy 1 relocating from 0:10 to 4:209615 at fcn 0.14556462