Unix Technical Forum

lost+found recovery

This is a discussion on lost+found recovery within the comp.unix.solaris forums, part of the Solaris Operating System category.


  #1 (permalink)  
Old 01-05-2008, 07:59 PM
Nanda H MT A3 4F36 420-7109
 


Hi!
We lost a critical file system last week due to disk failure.
Unfortunately, we were told the file system (on Solaris 2.6) was
never backed up because of a typo in the backup server
configuration.

Left with no option, we revived the disk to a usable condition. The
disk started spinning, but some sectors could not be read.

I was unable to mount the volume (striped; 13 GB of data on a 51 GB
Solaris UFS file system), not even 'ro', and hence I had to resort
to fsck to fix it.

I ran fsck and the program dumped core after 2 days. I called Veritas
and Sun support and was told to give up my 'restore' plans, since
the file system in question is badly corrupted.

I did not want to give up, and finally, through a Google search, I found
an fsck patch for this core dump (I think it was 105516-06) and
applied it. fsck ran for a few more days and stopped with the
error "cannot alloc -1414555316 bytes for inphead".

Then I tried fsck's alternate superblock option, and fsck got past
the above error and ran to completion.

fsck ran for a couple of days and moved all the salvaged files to the
'lost+found' directory. But again, I was not able to mount. I re-ran
fsck for a few more days.

This time, I was not able to mount the file system 'RW',
but I was able to mount it in 'RO' mode. Unfortunately, I was not
able to read any files in the 'lost+found' directory.

-----------------------------------------------------------------------
# ls -la
.: No such file or directory
-----------------------------------------------------------------------

I know there are files in the lost+found directory, because when I use
'fsdb', I get the following file/dir list.

-----------------------------------------------------------------------
# fsdb /dev/vx/rdsk/homedg/users1vol2
fsdb of /dev/vx/rdsk/homedg/users1vol2 (Read only) -- last mounted on /export/u1
fs_clean is currently set to FSACTIVE
fs_state consistent (fs_clean CAN be trusted)
/dev/vx/rdsk/homedg/users1vol2 > :ls
/:
./ ../ lost+found/
/dev/vx/rdsk/homedg/users1vol2 > :cd lost+found
/dev/vx/rdsk/homedg/users1vol2 > :ls
/lost+found:
#01087115 #20530574/ #21448328/ #22437120/ #23518080/
#01087116 #20532736/ #21448331/ #22437123/ #23518083/
#01087118 #20532739/ #21448355/ #22437129/ #23518087/
#01087119 #20532740/ #21452160/ #22439296/ #23518089/
#01088645 #20532750/ #21452161/ #22439326/ #23520256/
<Huge list follows>
-----------------------------------------------------------------------

One more thing I observed is that the 'lost+found' directory inode is now 8
and not 3 (as is normally the case). My guess is that the lost+found directory
is corrupt, because while running 'fsck' I saw the following
message for almost every inode on the file system.

-----------------------------------------------------------------------
DIRECTORY CORRUPTED I=8 OWNER=root MODE=41700
SIZE=1597440 MTIME=Nov 22 10:29 2003
DIR=?

SALVAGE? yes

UNREF FILE I=25610902 OWNER=gcs MODE=100640
SIZE=8486 MTIME=Feb 12 18:08 2001
RECONNECT? yes

DIRECTORY CORRUPTED I=8 OWNER=root MODE=41700
SIZE=1597440 MTIME=Nov 22 10:29 2003
DIR=?
-----------------------------------------------------------------------

I tried using 'ufsdump' and 'tar' to see if some program could read
the data, but without much success. I was able to 'dd' it to
another file system (actually, it is still running). Looking at
the output of dd, it appears to have some data in it.

My question is: do I have any other options left now? Any suggestions
are very much appreciated.
Thanks in advance.

-Nanda Hullahalli
  #2 (permalink)  
Old 01-05-2008, 08:00 PM
Darren Dunham
 

Nanda H MT A3 4F36 420-7109 <nanda@wnmail.att.com> wrote:

> Hi!
> We lost a critical file system last week due to disk failure.
> Unfortunately, we were told the file system (on Solaris 2.6) was
> never backed up because of a typo in the backup server
> configuration.

> Left with no option, we revived the disk to a usable condition. The
> disk started spinning, but some sectors could not be read.

> I was unable to mount the volume (striped; 13 GB of data on a 51 GB
> Solaris UFS file system), not even 'ro', and hence I had to resort
> to fsck to fix it.

#1 Immediately make a copy of the data as best you can (dd?).
#2 Make another copy and run any recovery attempts (fsck) against that copy.
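
The two copies above could be made roughly as follows. This is only a sketch: the function name image_copy, the default block size, and the /recovery paths are placeholders, not anything from the original post. It relies on the standard dd conversions noerror (keep going after a read error) and sync (pad short reads so offsets stay aligned).

```shell
# image_copy SRC DST [BLOCKSIZE]
# Copy a damaged device or file to an image, continuing past read errors.
# The function name and the default block size are illustrative choices.
image_copy() {
    src=$1
    dst=$2
    bs=${3:-512}
    # conv=noerror: do not stop at the first unreadable block.
    # conv=sync:    pad short or failed reads with NULs to a full block,
    #               so later data keeps its original offset in the image.
    dd if="$src" of="$dst" bs="$bs" conv=noerror,sync
}

# Hypothetical usage; /recovery stands for a healthy file system with
# enough free space:
#   image_copy /dev/vx/rdsk/homedg/users1vol2 /recovery/users1vol2.img
#   cp /recovery/users1vol2.img /recovery/work.img   # experiment on this one
```

A small block size loses less data around each bad sector but makes the copy slower. Because of sync, an input whose size is not a multiple of the block size is padded with NULs at the end.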

> My question is: do I have any other options left now? Any suggestions
> are very much appreciated.

There are some commercial recovery houses that can take your original
disks and work with them. Unfortunately, if you've already run fsck on the
original disks, their job may be much harder.

They're not cheap, but I've heard clients say they've performed miracles
for them. I can't recall the names of any particular ones, though.

--
Darren Dunham ddunham@taos.com
Unix System Administrator Taos - The SysAdmin Company
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
  #3 (permalink)  
Old 01-05-2008, 08:01 PM
Bigdakine
 

>Subject: lost+found recovery
>From: Nanda H MT A3 4F36 420-7109 nanda@wnmail.att.com
>Date: 11/23/03 3:23 PM Hawaiian Standard Time
>Message-id: <84dwb.103792$Ec1.4714006@bgtnsc05-news.ops.worldnet.att.net>

What kind of files are you looking to unearth? Are they ASCII?

Can you think of a string that would identify the lost file as uniquely as possible?

If so, then you might consider piping the output of dd through a version of grep
that lets you specify how many lines before and after the match it
prints.

I forget which flavor of grep that is (fgrep?).
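
For reference, GNU grep is one implementation with this feature: -B and -A set how many lines are printed before and after each match, and -a makes it treat binary input as text. A minimal sketch, assuming GNU grep is installed (the stock Solaris grep may not accept these options); the function name, default context sizes, image path and search string are all made up for illustration:

```shell
# grep_context IMAGE PATTERN [BEFORE] [AFTER]
# Print each line of IMAGE containing the fixed string PATTERN, plus
# BEFORE lines of leading and AFTER lines of trailing context.
# Requires GNU grep; the function name and defaults are illustrative.
grep_context() {
    image=$1
    pattern=$2
    before=${3:-5}
    after=${4:-20}
    # -a: treat binary data as text; -F: fixed string, not a regex
    grep -a -F -B "$before" -A "$after" -- "$pattern" "$image"
}

# Hypothetical usage against the dd image:
#   grep_context /recovery/users1vol2.img 'some unique string' 10 50 > hits.txt
```

In a raw image a "line" is whatever lies between two newline bytes, so context around binary regions can be very long.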

With any luck, the file(s) you're interested in were written sequentially to
disk.

If not, then you may need repeated passes to get the data back.
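
If a file was written contiguously, the data around a match can also be cut out of the image by byte offset, one pass per match. A rough sketch, again assuming GNU grep (for -b and -o); the function names and the window sizes in the usage comment are guesses, to be adjusted per file:

```shell
# match_offsets IMAGE PATTERN
# Print the decimal byte offset of every occurrence of the fixed string.
match_offsets() {
    # -b: prefix each match with its byte offset; -o: print only the match
    grep -a -b -o -F -- "$2" "$1" | cut -d: -f1
}

# carve_window IMAGE OFFSET BEFORE AFTER
# Write BEFORE bytes preceding OFFSET and AFTER bytes starting at OFFSET
# to standard output.
carve_window() {
    start=$(( $2 - $3 ))
    count=$(( $3 + $4 ))
    if [ "$start" -lt 0 ]; then
        # Window would begin before the image: clamp it to offset 0.
        count=$(( count + start ))
        start=0
    fi
    # bs=1 keeps the arithmetic in bytes; fine for small windows.
    dd if="$1" bs=1 skip="$start" count="$count" 2>/dev/null
}

# Hypothetical usage: save 4 KB before and 64 KB after every match.
#   for off in $(match_offsets image.img 'some unique string'); do
#       carve_window image.img "$off" 4096 65536 > "chunk.$off"
#   done
```

Fragmented files will show up as several chunks that have to be pieced together by hand.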

Stuart
  #4 (permalink)  
Old 01-05-2008, 08:01 PM
Srinandan Hullahalli MT A3 4F36 420-7109
 

Bigdakine <bigdakine@aol.comgetagrip> wrote:

> What kind of files are you looking to unearth? Are they ASCII?

> Can you think of a string that would identify the lost file as uniquely as possible?

> If so, then you might consider piping the output of dd through a version of grep
> that lets you specify how many lines before and after the match it
> prints.

> I forget which flavor of grep that is (fgrep?).

> With any luck, the file(s) you're interested in were written sequentially to
> disk.

> If not, then you may need repeated passes to get the data back.

> Stuart

Thanks, Darren and Stuart, for your quick responses.
Yes, most of them are text files. I did 'dd' them to a file (it is a 40+ GB file).
Thanks

-Nanda
