rman在rac或ASM环境做增量恢复时报错RMAN-06094 datafile 1 must be restored
Tags: OraclermanRMAN-06094增量恢复故障处理
现象
rman先做了全量的resotore+recover操作,然后还原增量的控制文件,然后做增量的recover操作,结果报错了:
1 2 3 4 5 6 7 8 | RMAN> recover database; Starting recover at 27-FEB-22 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of recover command at 02/27/2022 17:44:52 RMAN-06094: datafile 1 must be restored |
报错的意思:文件1必须被还原,文件1不存在。
分析
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | [oracle@odsa ~]$ rman target / Recovery Manager: Release 12.2.0.1.0 - Production on Sun Feb 27 17:34:46 2022 Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights reserved. connected to target database: lhrdb (DBID=461586729, not open) RMAN> report schema; using target database control file instead of recovery catalog RMAN-06139: warning: control file is not current for REPORT SCHEMA Report of database schema for database with db_unique_name lhrdb List of Permanent Datafiles =========================== File Size(MB) Tablespace RB segs Datafile Name ---- -------- -------------------- ------- ------------------------ 1 0 SYSTEM *** +DATA/lhrdb/datafile/system.259.985609967 2 0 SYSAUX *** +DATA/lhrdb/datafile/sysaux.260.985609969 3 0 UNDOTBS1 *** +DATA/lhrdb/datafile/undotbs1.261.985609969 4 0 UNDOTBS2 *** +DATA/lhrdb/datafile/undotbs2.263.985609973 5 0 USERS *** +DATA/lhrdb/datafile/users.264.985609975 。。。。。 List of Temporary Files ======================= File Size(MB) Tablespace Maxsize(MB) Tempfile Name ---- -------- -------------------- ----------- -------------------- 1 20 TEMP 32767 +DATA/lhrdb/tempfile/temp.262.985609969 2 16384 TEMP 16384 +DATA/lhrdb/tempfile/temp.353.1096538341 RMAN> exit |
结果发现,“Size(MB)”列都是0。有点奇怪。
然后去asmcmd里查看是否存在文件:
1 2 3 4 5 6 | ASMCMD> ls -l +DATA/lhrdb/datafile/system.259.985609967 ASMCMD-8002: entry 'system.259.985609967' does not exist in directory '+DATA/lhrdb/datafile/' ASMCMD> cd +DATA/lhrdb/datafile/ ASMCMD> ls -l system* Type Redund Striped Time Sys Name DATAFILE UNPROT COARSE FEB 27 17:00:00 Y SYSTEM.329.1097770431 |
结果发现,文件名不一样。
于是找之前的全量的restore过程,发现还原的文件名为:
1 2 3 4 5 6 | channel ORA_DISK_6: starting datafile backup set restore channel ORA_DISK_6: specifying datafile(s) to restore from backup set channel ORA_DISK_6: restoring datafile 00001 to +DATA/lhrdb/datafile/system.259.985609967 channel ORA_DISK_6: restoring datafile 00014 to +DATA/lhrdb/datafile/ids.276.985620689 channel ORA_DISK_6: restoring datafile 00022 to +DATA/lhrdb/datafile/ods.284.985621183 channel ORA_DISK_6: restoring datafile 00038 to +DATA/lhrdb/datafile/ods.300.985621707 |
第一次全量restore的时候,终端显示还原后的文件名为system.259.985609967,但是,其实asmcmd中的文件名已经发生了变化,这一点在我删除环境后,第二次做实验的时候得到了验证,另外,可以在告警日志中查询到恢复后的数据文件名:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | RMAN> report schema; using target database control file instead of recovery catalog RMAN-06139: warning: control file is not current for REPORT SCHEMA Report of database schema for database with db_unique_name lhrdb List of Permanent Datafiles =========================== File Size(MB) Tablespace RB segs Datafile Name ---- -------- -------------------- ------- ------------------------ 1 4096 SYSTEM *** +DATA/lhrdb/DATAFILE/system.289.1097780323 2 0 SYSAUX *** +DATA/lhrdb/datafile/sysaux.260.985609969 3 30720 UNDOTBS1 *** +DATA/lhrdb/DATAFILE/undotbs1.338.1097780321 4 0 UNDOTBS2 *** +DATA/lhrdb/datafile/undotbs2.263.985609973 5 4958 USERS *** +DATA/lhrdb/DATAFILE/users.270.1097780323 |
由于,第一次的全量后的归档丢失,所以,若要做恢复操作,必须恢复最新的控制文件,但是最新的控制文件里记录的文件名依然是源库的控制文件,所以就导致了报错“RMAN-06094: datafile 1 must be restored”。
解决
所以,解决方法就是,在做增量恢复操作,恢复最新的控制文件之前,执行report schema操作或“select file#,name from v$datafile;”记录最新的文件名,然后恢复最新的控制文件,再通过rename重命名的方式将控制文件中的名称修改为最新的数据文件名。
这个问题和单机还不太一样,单机的参考:https://www.xmmup.com/rmanquanliangzenglianghuifubaocuoora-01152-file-1-was-not-restored-from-a-sufficiently-old-backup.html
模拟过程
源端
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | dbca -silent -createDatabase -templateName General_Purpose.dbc -responseFile NO_VALUE \ -gdbname lhrdb -sid lhrdb \ -sysPassword oracle -systemPassword oracle \ -datafileDestination '+DATA' -recoveryAreaDestination 'FRA/' \ -storageType ASM -asmsnmpPassword oracle -diskGroupName 'DATA' \ -characterset ZHS16GBK -nationalCharacterSet AL16UTF16 \ -redoLogFileSize 50 \ -sampleSchema false \ -memoryPercentage 10 \ -databaseType OLTP \ -emConfiguration NONE \ -nodeinfo hqomsa,hqomsb -- 全备 ORACLE_SID=lhrdb1 create table t0 as select * from dual; run { backup INCREMENTAL LEVEL 0 as compressed backupset database format '/rman/%d%U.full'; sql 'alter system archive log current'; backup archivelog from time 'sysdate-5/24/60' format '/rman/%d%U.arc'; backup current controlfile format '/rman/%d%U.ctl'; backup spfile format '/rman/spfile_%d%U.ora'; } -- 切归档 alter system archive log current; create table t1 as select * from dual; -- 删除归档 delete archivelog all; -- 增量 alter system archive log current; create table t2 as select * from dual; run { backup INCREMENTAL LEVEL 1 as compressed backupset CUMULATIVE database format '/home/oracle/bk/%d%U_inc.full'; sql 'alter system archive log current'; backup as compressed backupset archivelog from time 'sysdate-2/24/60' format '/home/oracle/bk/%d%U_inc.arc' section size 100G; backup current controlfile format '/home/oracle/bk/%d%U_inc.ctl'; backup spfile format '/home/oracle/bk/spfile_%d%U.ora'; } |
目标端恢复
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 | -- 全量恢复 ORACLE_SID=lhrdb1 startup nomount restore spfile to '+data/spfilelhrdb.ora' from '/rman/spfile_LHRDB0d0mvhuk_1_1.ora'; echo "spfile='+data/spfilelhrdb.ora'" > /u01/app/oracle/product/12.2.0/dbhome_1/dbs/initlhrdb1.ora startup force nomount restore controlfile from '/rman/LHRDB040mvh7p_1_1.ctl'; alter database mount; list backupset of datafile 1; restore database; recover database; report schema; -- thread 2 with sequence 4 and starting SCN of 966683 -- 增量恢复 startup force nomount restore controlfile from '/home/oracle/bk/LHRDB0c0mvhhr_1_1_inc.ctl'; alter database mount; list backupset of datafile 1; report schema; alter database rename file '+DATA/lhrdb/datafile/system.289.1097843163' to '+DATA/LHRDB/DATAFILE/system.354.1097851967'; alter database rename file '+DATA/lhrdb/datafile/sysaux.290.1097843163' to '+DATA/LHRDB/DATAFILE/sysaux.355.1097851967'; alter database rename file '+DATA/lhrdb/datafile/undotbs1.291.1097843163' to '+DATA/LHRDB/DATAFILE/undotbs1.356.1097851967'; alter database rename file '+DATA/lhrdb/datafile/users.292.1097843163' to '+DATA/LHRDB/DATAFILE/users.358.1097851967'; alter database rename file '+DATA/lhrdb/datafile/undotbs2.300.1097843257' to '+DATA/LHRDB/DATAFILE/undotbs2.357.1097851967'; recover database; alter database disable block change tracking; alter system set cluster_database=false scope=spfile; alter database open resetlogs; -- 若有升级,后续做其它升级操作 startup upgrade; |
相关文章
之前一直在itpub上发表的,后来被官方删的差不多了,保留的文章的图片也看不见了,只能在其它平台找同步的文章了,这也是我下定决心自己搭建博客平台的一个很大原因:
【DATAGUARD】物理dg在主库丢失归档文件的情况下的恢复(七): https://blog.csdn.net/lihuarongaini/article/details/47948803
【故障处理】DG环境主库丢失归档情况下数据文件的恢复: https://www.cnblogs.com/lhrbest/articles/5897530.html
【故障处理】DG归档丢失的恢复:https://developer.aliyun.com/article/283937
总结
1、做任何的restore和recover操作之前,或者只要还原了控制文件后,需要查看控制文件中记录的备份信息是否正确,若很杂乱,那么需要删除控制文件中记录的备份信息后,然后再做恢复操作,清除方法如下:
1 2 3 4 5 6 7 8 9 | list backupset of datafile 1; EXECUTE DBMS_BACKUP_RESTORE.RESETCFILESECTION(11); /** CLEAR V$ARCHIVED_LOG */ EXECUTE DBMS_BACKUP_RESTORE.RESETCFILESECTION(12); /** CLEAR V$BACKUP_SET */ EXECUTE DBMS_BACKUP_RESTORE.RESETCFILESECTION(13); /** CLEAR V$BACKUP_PIECE */ EXECUTE DBMS_BACKUP_RESTORE.RESETCFILESECTION(17); /** CLEAR V$BACKUP_CORRUPTION */ EXECUTE DBMS_BACKUP_RESTORE.RESETCFILESECTION(18); /** CLEAR V$COPY_CORRUPTION */ EXECUTE DBMS_BACKUP_RESTORE.RESETCFILESECTION(19); /** CLEAR V$DELETED_OBJECT */ EXECUTE DBMS_BACKUP_RESTORE.RESETCFILESECTION(28); /** CLEAR V$RMAN_STATUS */ |
2、确保做自己做的全量+增量备份之间,没有其它人或启动定时任务或NBU自动做的全量备份或增量备份。这点很重要,否则最后肯定不能恢复。切记切记!!!
3、做增量恢复之前,先执行report schema操作,获取最新的文件名!若万一忘记执行report schema操作,那么最新的文件名还可以通过告警日志文件来获取,不过需要自己来编辑:more alert_mis.log | grep datafile
,获取结果后,通过列编辑或Excel等来快速编辑。
4、此类问题通过重建控制文件并不能解决,通过BBED等特殊手段,客户是不认的!!!