Home » RDBMS Server » Backup & Recovery » Recovering from Corrupted blocks in my data file
Recovering from Corrupted blocks in my data file [message #250375] Mon, 09 July 2007 18:53 Go to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
Hi,

I'm hoping somebody could be of some assistance. I'm running ORACLE 9.2.0.1.0 on Solaris.

It appears that I have some corrupted blocks in one of my datafiles as can be seen from the output in my alert logs.

Mon Jul 9 19:38:37 2007
Errors in file /oracle/OraHome92/admin/IAC/bdump/iac_smon_7699.trc:
ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []
ORACLE Instance IAC (pid = 6) - Error 600 encountered while recovering transaction (10, 24) on object 39909.
Mon Jul 9 19:38:39 2007
Errors in file /oracle/OraHome92/admin/IAC/bdump/iac_smon_7699.trc:
ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []
Recovery of Online Redo Log: Thread 1 Group 2 Seq 17447 Reading mem 0
Mem# 0 errs 0: /dbs/OraHome92/oradata/IAC/log_grp_1/redo02_01.log
Mem# 1 errs 0: /dbs/OraHome92/oradata/IAC/log_grp_2/redo02_02.log
Mem# 2 errs 0: /dbs/OraHome92/oradata/IAC/log_grp_3/redo02_03.log

Unfortunately we have two serious problems complicating the recovery - this corruption has been in the datafile for quite a long time (but went unnoticed because it didn't seem to be causing any problems with our application until the archive logs were "cleaned up"). Which gets us to the second problem - I'm not sure how far back I would have to go for a clean datafile but I'm positive that I don't have a complete set of archive logs to restore from anyway.

So it appears that a restore and recovery is not an option. Restoring to the point in time of my last clean datafile also is not really acceptable.

So right now I think I'm in damage control mode. I've examined the corrupted blocks and they all point to the same table in my database called "zbindb".

I'm wondering if somehow I could save the data in the uncorrupted blocks of this table? There are enough corrupted blocks that the scripts I've seen to extract rowids from either side of corrupted blocks would likely be impractical.

One solution I've considered after a lot of reading involves DBMS_REPAIR.SKIP_CORRUPT_BLOCKS.

Can I: run that against the table thereby marking the corruption, perform an exp of the clean data, drop the table, imp the tables data back into the database.

Does this seem like a feasible solution? Any alternatives?

Thanks in advance.

[Updated on: Mon, 09 July 2007 18:54]

Report message to a moderator

Re: Recovering from Corrupted blocks in my data file [message #250376 is a reply to message #250375] Mon, 09 July 2007 19:04 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Subject: Handling Oracle Block Corruptions in Oracle7/8/8i/9i/10g
On metalink - Doc ID: Note:28814.1
Re: Recovering from Corrupted blocks in my data file [message #250377 is a reply to message #250376] Mon, 09 July 2007 19:23 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
anacedent wrote on Mon, 09 July 2007 20:04
Subject: Handling Oracle Block Corruptions in Oracle7/8/8i/9i/10g
On metalink - Doc ID: Note:28814.1



Unfortunately we don't have direct oracle support and thus access to metalink. (ORACLE is licensed to us as part of a third party arrangement and they provide support).

They've passed on some metalink info particularly 61685.1 which is an extension to 28814.1

One of the things it mentioned in here is "Use a special event which can SKIP over corrupt blocks. This is by far the simplest option to extract table data and is discussed in Note 33405.1. Note that this event can only be used if the corrupt block reports ORA-1578."

I had read online about somebody using DBMS_REPAIR.SKIP_CORRUPT_BLOCKS paired with imp/exp for this task so I guess I'm trying to find some validation that this is valid.

Our in house DBA knowledge is extremely limited.

Re: Recovering from Corrupted blocks in my data file [message #250378 is a reply to message #250375] Mon, 09 July 2007 19:35 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
I would not want to be in your shoes.
run "dbv" to determine extent of corruption.
dbv help=yes
Identify which objects contain corrupt blocks.
 The following query will tell you the object TYPE , OWNER and NAME of
  a segment given the absolute file number "&AFN" and block number "&BL" of the
  corrupt block - the database must be open in order to use this query:

        SELECT tablespace_name, segment_type, owner, segment_name
          FROM dba_extents
         WHERE file_id = &AFN
           and &BL between block_id AND block_id + blocks - 1
	;
Re: Recovering from Corrupted blocks in my data file [message #250379 is a reply to message #250378] Mon, 09 July 2007 19:49 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
anacedent wrote on Mon, 09 July 2007 20:35
I would not want to be in your shoes.
run "dbv" to determine extent of corruption.
dbv help=yes
Identify which objects contain corrupt blocks.
 The following query will tell you the object TYPE , OWNER and NAME of
  a segment given the absolute file number "&AFN" and block number "&BL" of the
  corrupt block - the database must be open in order to use this query:

        SELECT tablespace_name, segment_type, owner, segment_name
          FROM dba_extents
         WHERE file_id = &AFN
           and &BL between block_id AND block_id + blocks - 1
	;



I don't want to be in my shoes either. Smile

I've run the dbv utility and it points to a good number of corrupt blocks. I've run the SELECT you've documented above on a number of them and everything seems to point to the same table owned by the same user.

[Updated on: Mon, 09 July 2007 19:49]

Report message to a moderator

Re: Recovering from Corrupted blocks in my data file [message #250380 is a reply to message #250375] Mon, 09 July 2007 19:52 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
Another question regarding this issue:

Is the DBMS_REPAIR.SKIP_CORRUPT_BLOCKS just a flag being flipped on and off? (ie: no real modification to my data)Basically I'm wondering if I was to set this, would it essentially allow the uncorrupted blocks in the table to be read again? Which could get my users back up and running and thus give me some breathing room.
Re: Recovering from Corrupted blocks in my data file [message #250381 is a reply to message #250375] Mon, 09 July 2007 19:57 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>I had read online about somebody using DBMS_REPAIR.SKIP_CORRUPT_BLOCKS paired with imp/exp for this task so I guess I'm trying to find some validation that this is valid.
1) I would try table export without doing anything else.
2) DBMS_REPAIR.SKIP_CORRUPT_BLOCKS is relatively benign.
3) I would do a test import into another DB to do damage assessment.
4) At some point you'll have to DROP TABLE & then CREATE TABLE.
5) After the DROP TABLE then dbv should report a clean tablespace.
Re: Recovering from Corrupted blocks in my data file [message #250383 is a reply to message #250381] Mon, 09 July 2007 19:59 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
anacedent wrote on Mon, 09 July 2007 20:57
>I had read online about somebody using DBMS_REPAIR.SKIP_CORRUPT_BLOCKS paired with imp/exp for this task so I guess I'm trying to find some validation that this is valid.
1) I would try table export without doing anything else.
2) DBMS_REPAIR.SKIP_CORRUPT_BLOCKS is relatively benign.
3) I would do a test import into another DB to do damage assessment.
4) At some point you'll have to DROP TABLE & then CREATE TABLE.
5) After the DROP TABLE then dbv should report a clean tablespace.


I've tried the exp "as is" and it always throws errors on the corrupt table.

What ORACLE user do I run the DBMS_REPAIR.SKIP_CORRUPT_BLOCKS as? sys as sysdba? I can do it to an online database?
Re: Recovering from Corrupted blocks in my data file [message #250384 is a reply to message #250381] Mon, 09 July 2007 20:10 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
anacedent wrote on Mon, 09 July 2007 20:57
>I had
2) DBMS_REPAIR.SKIP_CORRUPT_BLOCKS is relatively benign.



Do you think it's worth turning on to see if I can my users back and running while we work towards exporting, dropping and importing? It's getting late here and if possible I'd like to make some users temporarily happy in the morning.

Does DBMS_REPAIR.SKIP_CORRUPT_BLOCKS actually DO anything by itself? Some things I've read say that it will allow corrupted blocks to be skipped when used by itself - essentially as an alternative to FIX_CORRUPT_BLOCKS etc... Other places seem to imply that you first need to FIX_CORRUPT_BLOCKS and THEN you can SKIP_CORRUPT_BLOCKS.

[Updated on: Mon, 09 July 2007 20:13]

Report message to a moderator

Re: Recovering from Corrupted blocks in my data file [message #250385 is a reply to message #250375] Mon, 09 July 2007 20:52 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
The classic advice is cases like this one is to shutdown the DB & take a complete cold backup of the system NOW!.
The reason is that so you can at least get back to this state should things go from bad to worse.
Re: Recovering from Corrupted blocks in my data file [message #250390 is a reply to message #250385] Mon, 09 July 2007 21:48 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
anacedent wrote on Mon, 09 July 2007 21:52
The classic advice is cases like this one is to shutdown the DB & take a complete cold backup of the system NOW!.
The reason is that so you can at least get back to this state should things go from bad to worse.


Thats my first task tomorow morning before changing ANYTHING.

Smile
Re: Recovering from Corrupted blocks in my data file [message #251827 is a reply to message #250375] Mon, 16 July 2007 16:29 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
We've worked with our application vendor's ORACLE experts to recover the data around the corruption and did so with a minimum of data loss.

Unfortunately we've had the problem re-occur after doing this twice now.

First time:

We used the DBMS_REPAIR utilities to block off the corruption, allowing us to get a clean exp.. We created a new table and imp all of the exported data to it and then dropped the corrupted table and renamed the "new" one.
Within a day this table was corrupted again.

Second time:
We extracted the "good" data from the table using a select on the index and copied this data to a new table. Then dropped the corrupted table. For good measure we did an exp of the ORACLE user/tablespace and dropped the original(including data files).. recreated and imported the data. Within 3 days the table was again corrupted.

I believe (with my limited knowledge of ORACLE) that the steps we took should have ruled out any kind of logical problem with the data as well as physical problems with the datafiles etc.

And the fact that it's the SAME table in the same user/tablespace every time seems to indicate that the application is generating the corruption (for whatever reason).

The application is using this table to store LONG information for rows of other tables. We're actually using below the Vendor reccomended ORACLE patchset so they're now suspecting that this could be the problem. Yes, it's an easy "out" on their part but they've been good to us so far and everybody is running out of ideas.

Does anybody know of any problems handling LONG data in the 9.2.0.1 version? I think my company is finally going to spring for the money to have someone in to do the upgrade (we have a complicated configuration) but I'm just curious if we can expect this to fix the problem.


Re: Recovering from Corrupted blocks in my data file [message #251829 is a reply to message #250375] Mon, 16 July 2007 17:02 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Having some better details (such as Oracle error code) would make the search for "known bugs" easier.

Why are you the only customer of this vendor that is having this problem?

What changed between when it was working OK & when the problem started happening?

Re: Recovering from Corrupted blocks in my data file [message #251837 is a reply to message #251829] Mon, 16 July 2007 18:07 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
anacedent wrote on Mon, 16 July 2007 18:02
Having some better details (such as Oracle error code) would make the search for "known bugs" easier.

Why are you the only customer of this vendor that is having this problem?

What changed between when it was working OK & when the problem started happening?




At home now but will pull some more error output tomorrow. We are still seeing the same errors in our alert logs as I posted in the original post. Not showing the object number anymore though and the parameters looks like this right now: [6856], [0], [5],

Regarding other customers... This particular software is highly specialized and they have smaller numbers of large customers. They've recently released a new version and I believe that we're one of the few customers using it live in production right now. Even more likely we're probably the only customer running live on SOLARIS. Also, different customers do different things and there are various configurations.. We occasionally do discover bugs that are platform or configuration specific. This is the first release that has recommended the 9.2.0.4.0 patch level.

What has changed - hard to say. The initial instance of corruption has been around so long it's difficult to determine.
Re: Recovering from Corrupted blocks in my data file [message #252053 is a reply to message #250375] Tue, 17 July 2007 13:27 Go to previous messageGo to next message
ShaV
Messages: 20
Registered: July 2007
Junior Member
So here is the error being seen in my alert logs:


Tue Jul 17 14:18:40 2007
SMON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
Tue Jul 17 14:18:40 2007
Errors in file /oracle/OraHome92/admin/IAC/bdump/iac_smon_6506.trc:
ORA-00600: internal error code, arguments: [6856], [0], [5], [], [], [], [], []
ORACLE Instance IAC (pid = 6) - Error 600 encountered while recovering transaction (10, 41) on object 57812.
Tue Jul 17 14:18:42 2007
Errors in file /oracle/OraHome92/admin/IAC/bdump/iac_smon_6506.trc:
ORA-00600: internal error code, arguments: [6856], [0], [5], [], [], [], [], []
Recovery of Online Redo Log: Thread 1 Group 3 Seq 17689 Reading mem 0
Mem# 0 errs 0: /dbs/OraHome92/oradata/IAC/log_grp_1/redo03_01.log
Mem# 1 errs 0: /dbs/OraHome92/oradata/IAC/log_grp_2/redo03_02.log
Mem# 2 errs 0: /dbs/OraHome92/oradata/IAC/log_grp_3/redo03_03.log


The trace file is showing:

MON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
*** 2007-07-17 03:02:58.401
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [6856], [0], [5], [], [], [], [], []

dbv logs show:

Block Checking: DBA = 62940082, Block Type = KTB-managed data block
data header at 0x14ba94
kdbchk: avsp bad (-1)
Page 25522 failed with check code 6126

DBVERIFY - Verification complete

Total Pages Examined : 256000
Total Pages Processed (Data) : 1833
Total Pages Failing (Data) : 1
Total Pages Processed (Index): 2499
Total Pages Failing (Index): 0
Total Pages Processed (Other): 8972
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 242696
Total Pages Marked Corrupt : 0
Total Pages Influx : 0

Re: Recovering from Corrupted blocks in my data file [message #275744 is a reply to message #250375] Mon, 22 October 2007 10:15 Go to previous message
ShaV
Messages: 20
Registered: July 2007
Junior Member
Everybody that replied in this thread - just wanted to thank you for your thoughts, suggestions and ideas. As an update - we patched ORACLE to 9.2.0.7.0 and then again ran through the repair process with the vendor. This time the corrupt blocks did NOT return so it looks like we have this one licked.

[Updated on: Mon, 22 October 2007 10:15]

Report message to a moderator

Previous Topic: backup from oracle to tape under windows 2000 server
Next Topic: RMAN Restoration
Goto Forum:
  


Current Time: Mon May 20 09:00:37 CDT 2024