
Re: Disk/file system trouble




i'll tell you about something a little similar i once encountered -
there was this linux box that crashed one day, and after rebooting and
fsck-ing, it would appear to hang at the same point on every boot - i.e.
the normal boot messages would be displayed, and after filling about 3/4
of the screen, the output would just stop.

after two hours of playing with it (preparing rescue disks, doing fsck-s,
etc.), we tried connecting to it from the network - it worked. i.e. the
machine was up and running all the time, but just did not show the boot
messages properly after some point. so i thought that perhaps a process
was getting stuck on the /dev/console file. i checked it, and found that
its device major number had been changed to some odd value... apparently,
during the crash, this specific file got damaged. i removed the file and
re-created it (mknod ...), and after rebooting the machine again, it
worked flawlessly.
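
for anyone who wants to repeat that check without squinting at "ls -l"
output, here is a minimal sketch in python. it is not what we ran back
then, just an illustration. the names device_numbers, looks_damaged and
EXPECTED_CONSOLE are made up for this example, and the expected pair
(5, 1) is an assumption: it is what the linux device list assigns to
/dev/console, so verify it against a healthy machine running the same
system before trusting it.

```python
import os
import stat

# Assumption: on Linux, /dev/console is character device major 5, minor 1.
# Verify this against a known-good machine before relying on it.
EXPECTED_CONSOLE = (5, 1)


def device_numbers(path):
    """Return (major, minor) if path is a character device, else None."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        return None
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def looks_damaged(path, expected):
    """True if path is missing, is not a character device, or has
    device numbers different from the expected (major, minor) pair."""
    try:
        return device_numbers(path) != expected
    except FileNotFoundError:
        return True


if __name__ == "__main__":
    path = "/dev/console"
    if looks_damaged(path, EXPECTED_CONSOLE):
        print(path, "does not match the expected device numbers")
    else:
        print(path, "looks fine")
```

the same two functions can be pointed at any other device node, as long
as you know which numbers it is supposed to have.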

the moral of this story: if your system had an odd crash, running fsck
does NOT repair damaged files. it only restores the consistency of the
file system structure: it can reclaim lost inodes (and place them as
files under /lost+found), fix inode counts, etc. the only part that can
really be restored is the super-block, for which several copies are
usually kept on disk. thus, unless you're so lucky as to find out exactly
what went wrong, your only safe solution is to re-install the base
system, and then load everything else from backup. even if you fix
something manually and manage to boot properly, you won't be sure that
there are no corrupt files on disk. that's the price of not having a
journalling file system, i guess...

guy