One of my non-geeky friends is currently trying Linux on his laptop. For some reason, he shut it down improperly; on the next boot a fsck was forced, and it failed. He got this:
*** An error occurred during the file system check.
*** Dropping you to a shell; the system will reboot
*** when you leave the shell.
Give root password for maintenance
(or type Control-D to continue):
Obviously, at this point, he had no idea what to do… He couldn’t boot into his system and he had no idea how to fix it.
THIS IS REALLY REALLY STUPID AND BAD!
And it seems all distributions are doing it, including the supposedly user friendly ones like Fedora and Ubuntu. All non-geeks will be completely lost at this point…
What should be done is: “Repairing your file system may cause errors!!! Do you want to: [F]orce Repair, get to a [S]hell or [R]eboot?”, ideally in a nice curses front-end. And in the force repair case, just run “fsck -y”.
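To make the idea concrete, here is a rough sketch of such a prompt in Python. It is only an illustration, not what any distribution ships: the function names and the `/dev/sda1` device are made up for the example, and a real version would have to run from whatever minimal environment the boot scripts have available at that point.

```python
import subprocess

PROMPT = (
    "Repairing your file system may cause errors!\n"
    "Do you want to: [F]orce repair, get to a [S]hell or [R]eboot? "
)


def command_for(choice, device):
    """Map the user's answer to the command to run, or None if unrecognised."""
    key = choice.strip().lower()[:1]
    if key == "f":
        # "fsck -y" answers yes to every repair question.
        return ["fsck", "-y", device]
    if key == "s":
        return ["/bin/sh"]
    if key == "r":
        return ["reboot"]
    return None


def main(device="/dev/sda1"):
    """Keep asking until the answer is one of F, S or R, then act on it."""
    # The device path is a placeholder; the boot scripts would know the real one.
    command = None
    while command is None:
        command = command_for(input(PROMPT), device)
    subprocess.call(command)
```

Calling `main()` asks the question and runs the chosen command; anything other than F, S or R just asks again instead of dropping the user somewhere unexpected.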
18 September 2008 à 2:48 pm
Yep, it’s really _really_ stupid. Because at that point, most non-technical users I know will probably shut down the computer and try to start it again after a few minutes. But that won’t make a difference, obviously. So, it’s a dead end.
There are still a bunch of things like this scattered all over desktop Linux that make its adoption hard. I mean, Linux is really user friendly as long as everything works. But when things don’t, you’d better know a geek who can come to your aid. I also have a feeling that these things are easily overlooked by developers, which makes the corner cases rather hard to find.
18 September 2008 à 3:41 pm
Just don’t ask the question at all! It’s a huge bug that it’s needed.
It’s the same with richard ‘popup dialog’ hughsient. Just help the user, no questions asked.
18 September 2008 à 3:44 pm
@anon: The reason it stops at all is that the modifications can cause corruption, so there may be smarter ways to deal with it manually. So being able to get a shell may be very useful too (if you are an expert).
18 September 2008 à 4:56 pm
The problem there is that you need to ask a question. If you don’t ask a question, there is the potential for data loss depending on an automatic action. But to ask a question in a fancy way, you need to be able to load things like the ncurses library off /, which may be corrupted and unreadable. It’s a serious problem. Other OSs don’t do it much better, and that is what technical support is for, be it the geek friend or something you paid for.
18 September 2008 à 4:56 pm
Why can repairing the filesystem cause errors? I’ve never seen Windows’s Scandisk warning that it can destroy the filesystem, so why is fsck more “dangerous”?
18 September 2008 à 5:27 pm
Hongli Lai: AFAIK Windows Scandisk will only ask if it actually found a problem and wants to repair it. Usually there is nothing wrong with the filesystem after a hard shutdown, but sometimes Scandisk finds corrupted parts, and then it asks.
18 September 2008 à 5:38 pm
There is a bug in the Launchpad about this issue: https://bugs.launchpad.net/ubuntu/+source/ubuntu-meta/+bug/58430
That may be the best place to have this conversation.
I completely agree that this is a serious issue that should be addressed. If I weren’t quite experienced (unfortunately) with fsck et al, I’d be completely lost just as your friend was.
18 September 2008 à 6:53 pm
“A serious error that cannot be handled has occurred; do you want it to be handled?” Errr, wait, hang on…
18 September 2008 à 7:45 pm
@matthew: It can be handled, it’s just that there is a risk of further corruption (compared to fixing it by hand with debugfs and other such expert tools).
18 September 2008 à 7:57 pm
I’ve filed a bug with Fedora too: https://bugzilla.redhat.com/show_bug.cgi?id=462804 . Let’s see if they can fix their stuff.
19 September 2008 à 3:01 am
I haven’t seen that prompt in _years_. It just never seems to happen with ext3. Did he somehow install it with ext2 instead?
19 September 2008 à 6:43 am
@Tester: So what you are saying is that there is a seriously important trade-off that needs to be evaluated before any further action is taken?
19 September 2008 à 5:07 pm
@matthew: Yes, there is a trade-off… But sadly, the trade-off is between getting a filesystem expert (not your random sysadmin) and risking data loss.
20 September 2008 à 7:29 am
I’ve been inspired to finally go around and set FSCKFIX=yes on all of my systems.
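For anyone wanting to do the same across several machines, a small sketch of the edit in Python follows. FSCKFIX is the Debian/Ubuntu boot-script setting, which as far as I know lives in /etc/default/rcS; treat that path as an assumption and check your own distribution. The helper name `set_fsckfix` is made up for this example, and it only transforms text, so reading and writing the file (as root) is left to the caller.

```python
def set_fsckfix(config_text, value="yes"):
    """Return config_text with FSCKFIX set to value, appending it if absent."""
    new_line = "FSCKFIX=" + value
    lines = config_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Only touch active assignments; commented-out lines are left alone.
        if line.strip().startswith("FSCKFIX="):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

Keeping the function free of file access makes it easy to try on a copy of the config first and compare the result before overwriting anything.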
20 September 2008 à 10:21 am
Perhaps one of two messages:
SMART Error:
Your boot drive reports that it is suffering a hardware failure. This hardware cannot be repaired and must be replaced.
Or No SMART Error / No SMART:
Your boot drive contains errors, but no hardware errors have been detected in your hard drive. This is most commonly caused by a bad hard drive, but can also be caused by a bad cable, motherboard, or RAID card. In rare cases this can be caused by problems with your computer’s system software.
Followed By:
Some or all of your data may be recoverable.
Professional data recovery services offer the best results, but these services are usually extremely expensive. If you wish to use a professional data recovery service, power off your computer immediately and do not turn it on again until the drive has been replaced.
If you hear strange noises coming from the drive, you should NOT turn the power off, as it is likely the drive will no longer boot and self-data recovery will become impossible. You should select the Ignore and Attempt to Boot option, and back up your data as quickly as possible.
Self-recovery is frequently successful in temporarily restoring your drive to a working state. Back up your data immediately and replace the drive, as more severe problems are likely.