备库数据文件异常,物理DG如何恢复?
Tags: DGOracle丢失归档故障处理数据文件物理DG
有的时候由于备库空间不足,在主库添加了数据文件后,导致备库数据文件的缺失,可能很久之后才发现,但是由于归档的缺失等其它原因而导致备库不能正常应用Redo日志。还有其它情况可能导致备库的数据文件不能正常ONLINE,在这种情况下,可以在主库上利用CONVERT命令备份一个数据文件然后拷贝到备库即可。若是备库归档文件比较全,则可以直接在备库创建数据文件后应用Redo日志即可,而不需要从主库拷贝数据文件。
恢复过程中的一些关键性的命令如下所示:
1 2 3 4 5 6 | CONVERT DATAFILE '+DATA1/oralhrs/datafile/tbs101.262.923139373' FORMAT '/tmp/tbs101.dbf_bk';--主库备份相关文件 CONVERT DATAFILE '/tmp/tbs101.dbf_bk' FORMAT '+DATA1';--备库修改从主库拷贝过来的文件为ASM格式 ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='MANUAL' SID='*'; --备库修改文件管理模式为手动 ALTER DATABASE CREATE DATAFILE 64 AS '+DATA1';--备库若数据文件丢失可以先创建一个数据文件 ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215'; --重命名刚新建的数据文件为从主库拷贝过来的数据文件 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;--启用Redo恢复 |
接下来演示整个恢复过程。
首先查看备库的文件情况,发现64号文件处于OFFLINE状态。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | SYS@oraLHRDG2> SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64); FILE# NAME REC CHECKPOINT_CHANGE# STATUS ---------- ---------------------------------------------------- --- ------------------ ------- 1 +DATA1/oralhrsg/datafile/system.358.869055401 1.5760E+10 ONLINE 2 +DATA1/oralhrsg/datafile/sysaux.354.869047985 1.5760E+10 ONLINE 64 +DATA1/oralhrsg/datafile/tbs101.382.875442343 1764555149 OFFLINE SYS@oraLHRDG2> RECOVER DATAFILE 64; ORA-00283: recovery session canceled due to errors ORA-01153: an incompatible media recovery is active SYS@oraLHRDG2> RECOVER MANAGED STANDBY DATABASE CANCEL; Media recovery complete. SYS@oraLHRDG2> RECOVER DATAFILE 64; ORA-00283: recovery session canceled due to errors ORA-01610: recovery using the BACKUP CONTROLFILE option must be done SYS@oraLHRDG2> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION; Database altered. SYS@oraLHRDG2> ALTER DATABASE DATAFILE 64 ONLINE; alter database datafile 64 online * ERROR at line 1: ORA-01113: file 64 needs media recovery ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.382.875442343' |
虽然可以开启实时应用进程,但是64号文件依然不能ONLINE,因为现在的系统SCN号和64号文件头的SCN号相差很大了,归档日志必然不存在了,所以使用日志来恢复文件的方法自然不可行了。那么,接下来在主库用CONVERT命令备份64号文件:
1 2 3 4 5 6 7 8 9 10 11 12 | [ZFLHRSDB1:oracle]:/oracle>rman target / Recovery Manager: Release 11.2.0.3.0 - Production on Wed Sep 21 14:49:56 2016 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. connected to target database: ORAIPPS (DBID=1344172889) RMAN> CONVERT DATAFILE '+DATA1/oralhrs/datafile/tbs101.262.923139373' FORMAT '/tmp/tbs101.dbf_bk'; Starting conversion at target at 2016-09-21 14:51:16 using channel ORA_DISK_1 channel ORA_DISK_1: starting datafile conversion input file name=+DATA1/oralhrs/datafile/tbs101.262.923139373 converted datafile=/tmp/tbs101.dbf_bk channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:03 Finished conversion at target at 2016-09-21 14:51:19 |
将备份的文件拷贝到备库:
1 2 3 4 5 6 | [ZFLHRSDB1:oracle]:/tmp>scp /tmp/tbs101.dbf_bk oracle@192.68.155.16:/tmp/tbs101.dbf_bk The authenticity of host '192.68.155.16 (192.68.155.16)' can't be established. RSA key fingerprint is 7b:d6:ba:ca:b3:71:b5:0b:bf:14:f4:e4:18:5f:51:45. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.68.155.16' (RSA) to the list of known hosts. tbs101.dbf_bk 100% 100MB 50.0MB/s 00:02 |
在备库上转换文件为ASM格式:
1 2 3 4 5 6 7 8 9 10 | RMAN> CONVERT DATAFILE '/tmp/tbs101.dbf_bk' FORMAT '+DATA1'; Starting conversion at target at 2016-09-21 14:53:33 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=1542 instance=oraLHRDG2 device type=DISK channel ORA_DISK_1: starting datafile conversion input file name=/tmp/tbs101.dbf_bk converted datafile=+DATA1/oralhrsg/datafile/tbs101.382.923151215 channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01 Finished conversion at target at 2016-09-21 14:53:36 |
备库上进行重命名操作,若是备库上64号文件被删除了,则需要先重建64号文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | SYS@oraLHRDG2> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='MANUAL' SID='*'; System altered. SYS@oraLHRDG2> ALTER DATABASE CREATE DATAFILE 64 AS '+DATA1'; Database altered. SYS@oraLHRDG2> SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64); FILE# NAME REC CHECKPOINT_CHANGE# STATUS ---------- ------------------------------------------------ --- ------------------ ------- 1 +DATA1/oralhrsg/datafile/system.358.869055401 1.5761E+10 ONLINE 2 +DATA1/oralhrsg/datafile/sysaux.354.869047985 1.5761E+10 ONLINE 64 +DATA1/oralhrsg/datafile/tbs101.483.923151901 1.5761E+10 OFFLINE SYS@oraLHRDG2> ALTER DATABASE DATAFILE 64 ONLINE; ALTER DATABASE DATAFILE 64 ONLINE * ERROR at line 1: ORA-01113: file 64 needs media recovery ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.483.923151901' |
可以看到已经有64号文件了,下边进行重命名,修改为从主库拷贝过来的64号文件:
1 2 3 4 5 6 7 8 9 10 11 12 | SYS@oraLHRDG2> ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215'; ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215' * ERROR at line 1: ORA-01511: error in renaming log/data files ORA-01121: cannot rename database file 64 - file is in use or recovery ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.483.923151901' SYS@oraLHRDG2> ! oerr ora 01121 01121, 00000, "cannot rename database file %s - file is in use or recovery" // *Cause: Attempted to use ALTER DATABASE RENAME to rename a // datafile that is online in an open instance or is being recovered. // *Action: Close database in all instances and end all recovery sessions. |
该错误提示文件在使用,不能被重命名。由于该库是RAC库,需要先关闭DG,启动到MOUNT状态后再重命名:
1 2 | [ZFLHRSDB4:root]:/>srvctl stop db -d oralhrsg [ZFLHRSDB4:root]:/>srvctl start db -d oralhrsg -o mount |
在数据库中重命名:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | SYS@oraLHRDG2> conn / as sysdba Connected. SYS@oraLHRDG2> ALTER DATABASE RENAME FILE '+DATA1/oralhrsg/datafile/tbs101.483.923151901' TO '+DATA1/oralhrsg/datafile/tbs101.382.923151215'; Database altered. SYS@oraLHRDG2> ALTER DATABASE DATAFILE 64 ONLINE; Database altered.<<<<<<<<<---------数据文件可以ONLINE了 SYS@oraLHRDG2> COL NAME FOR A50 SYS@oraLHRDG2> COL CHECKPOINT_CHANGE# FOR 9999999999999 SYS@oraLHRDG2> SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64); FILE# NAME REC CHECKPOINT_CHANGE# STATUS ---------- -------------------------------------------------- --- ------------------ ------- 1 +DATA1/oralhrsg/datafile/system.358.869055401 15760776695 ONLINE 2 +DATA1/oralhrsg/datafile/sysaux.354.869047985 15760776695 ONLINE 64 +DATA1/oralhrsg/datafile/tbs101.382.923151215 15760492416 ONLINE SYS@oraLHRDG2> ALTER DATABASE OPEN READ ONLY; ALTER DATABASE OPEN READ ONLY * ERROR at line 1: ORA-10458: standby database requires recovery ORA-01194: file 64 needs more recovery to be consistent ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.382.923151215' <<<<<<<<<------------ 打开数据库依然报错,尝试手动恢复一下,看看需要哪些日志,因为64号文件已经是最新的了 SYS@oraLHRDG2> RECOVER DATABASE; ORA-00283: recovery session canceled due to errors ORA-01610: recovery using the BACKUP CONTROLFILE option must be done SYS@oraLHRDG2> RECOVER STANDBY DATABASE USING BACKUP CONTROLFILE; ORA-00279: change 15760492416 generated at 09/21/2016 11:38:54 needed for thread 1 ORA-00289: suggestion : /arch/1_12918_868895513.arc ORA-00280: change 15760492416 for thread 1 is in sequence #12918 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 64 needs more recovery to be consistent ORA-01110: data file 64: '+DATA1/oralhrsg/datafile/tbs101.382.923151215' ORA-01112: media recovery not started <<<<<<<<<------------ 缺少12918日志,很欣慰,因为12918已经是最新的日志了,这里解决起来就很简单了,可以从主库拷贝12918日志到备库,但是这样太麻烦,可以开启备库的应用进程让其自动解决备库的GAP问题 SYS@oraLHRDG2> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION; Database altered. |
此时查看告警日志,很欣慰看到了12918日志过来了:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | Wed Sep 21 15:24:33 2016 alter database recover managed standby database using current logfile disconnect from session Attempt to start background Managed Standby Recovery process (oraLHRDG2) Wed Sep 21 15:24:33 2016 MRP0 started with pid=44, OS id=12649040 MRP0: Background Managed Standby Recovery process started (oraLHRDG2) started logmerger process Wed Sep 21 15:24:39 2016 Managed Standby Recovery starting Real Time Apply Parallel Media Recovery started with 16 slaves Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Wed Sep 21 15:24:40 2016 Media Recovery Log /arch/1_12918_868895513.arc Media Recovery Log /arch/2_12918_868895513.arc Completed: alter database recover managed standby database using current logfile disconnect from session Datafile 64 added to flashback set Media Recovery Log /arch/2_12919_868895513.arc Media Recovery Log /arch/1_12919_868895513.arc Media Recovery Log /arch/2_12920_868895513.arc Media Recovery Log /arch/1_12920_868895513.arc Media Recovery Log /arch/2_12921_868895513.arc Media Recovery Log /arch/1_12921_868895513.arc Media Recovery Waiting for thread 2 sequence 12922 (in transit) Recovery of Online Redo Log: Thread 2 Group 12 Seq 12922 Reading mem 0 Mem# 0: +DATA1/oralhrsg/onlinelog/group_12.353.869055809 Media Recovery Waiting for thread 1 sequence 12922 (in transit) Recovery of Online Redo Log: Thread 1 Group 8 Seq 12922 Reading mem 0 Mem# 0: +DATA1/oralhrsg/onlinelog/group_8.344.869055791 |
最后重启备库的2个节点:
1 2 3 4 5 | [ZFLHRSDB4:root]:/>srvctl stop db -d oralhrsg [ZFLHRSDB4:root]:/>srvctl start db -d oralhrsg [ZFLHRSDB4:root]:/>srvctl status db -d oralhrsg Instance oraLHRDG1 is running on node zflhrsdb3 Instance oraLHRDG2 is running on node zflhrsdb4 |
而数据库中64号文件已经正常了:
1 2 3 4 5 6 7 8 9 10 11 12 13 | SYS@oraLHRDG2> SELECT A.FILE#,A.NAME,A.RECOVER,A.CHECKPOINT_CHANGE#,STATUS FROM V$DATAFILE_HEADER A WHERE A.FILE# IN (1,2,64); FILE# NAME REC CHECKPOINT_CHANGE# STATUS ---------- -------------------------------------------------- --- ------------------ ------- 1 +DATA1/oralhrsg/datafile/system.358.869055401 15760815694 ONLINE 2 +DATA1/oralhrsg/datafile/sysaux.354.869047985 15760815694 ONLINE 64 +DATA1/oralhrsg/datafile/tbs101.382.923151215 15760815694 ONLINE SYS@oraLHRDG2> show parameter standby NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ standby_archive_dest string ?/dbs/arch standby_file_management string MANUAL SYS@oraLHRDG2> ALTER SYSTEM SET standby_file_management='AUTO' SID='*';====>>>>> 别忘记将该参数修改回来 System altered. |
最后不要忘记将STANDBY_FILE_MANAGEMENT参数修改为AUTO。