[SOLVED] Help debugging an issue with Raspberry Pi 4B boot
Hi all,
I am having a strange new issue with one of the Raspberry Pi 4Bs I have running at home. It failed/restarted for some reason and is now stuck at boot with the line:
Waiting for root device LABEL=writable...
I am booting this Pi from USB. From what I can see the disk is OK: I can mount it on my laptop and access it correctly, and the partition is labelled correctly. I moved it to another Pi I have and got the same error (I did this to rule out the Pi/USB port).
I am pretty sure it is not the power that is the issue (since I am giving it more than enough).
All of this was working correctly until now (for months). Ubuntu may have updated something (my fault, I may not have disabled the auto-update) or something else could have broken.
I can try to point to the partition via UUID instead of the label, but something tells me that is not the issue. Did anybody encounter such an issue in the past or has any advice on how to debug it?
Sorry to hear that, booting problems suck and are horrible to debug.
The next step I would try would be to boot another install, like a live USB or Raspbian, on the same USB port, to completely eliminate a hardware problem if it boots properly.
If it is a software problem, it seems to happen very early in the boot process, so my bet would be a corrupted initramfs/initrd (or whatever the equivalent is on a Pi). No idea how you could debug and fix that on Ubuntu, though (especially on a Pi where /boot is… different). Maybe install another Ubuntu on another disk or stick, then copy the boot files over (making a backup of the original files so you can restore them if it doesn't help, no need to pile up possible causes of trouble). Throwing it here in case other people know better.
> The next step I would try would be to boot another install, like a live USB or Raspbian, on the same USB port, to completely eliminate a hardware problem if it boots properly.
Good advice. I moved the USB drive from another (working) Pi and attached it to the same USB port. It boots correctly, so it is neither the USB port nor the power.
> If it is a software problem, it seems to happen very early in the boot process, so my bet would be a corrupted initramfs/initrd (or whatever the equivalent is on a Pi). No idea how you could debug and fix that on Ubuntu, though (especially on a Pi where /boot is… different).
I believe it is something like that. Either it is not mounting the drive correctly and so not finding it, or it is something else. I just wish there was a better (or any) error printed on the console. I tried attaching a keyboard to get to a shell, with no success.
I honestly could just reformat the drive and do a clean install, but that is the last resort. I would like to understand what happened so I can learn from it and avoid it in the future (or learn a path to fix it).
> I believe it is something like that. Either it is not mounting the drive correctly and so not finding it, or it is something else.
Yes, that's what I had in mind. I have had similar problems with an initramfs before: it was responsible for loading the drivers needed to mount the disk holding the root filesystem, so a bad initramfs made the boot fail from the get-go, unable to mount the partition.
That being said, I've looked at my Pi, and I have no idea what would serve as an initramfs among all those Pi-specific files in /boot, if anything. If you really want to understand what happened, I guess a possible path would be to find resources on the web explaining the Pi boot process in detail, since it's different from the usual Linux boot process (it's not just the Pi either; I played with other ARM devices, like the stuff from pine64, and they all had their own way to boot).
Hi, just writing it here in case you were curious about the cause of the problem.
I finally had some time today to work on it. What I tried was to copy the content of the system boot partition of another Pi, one that I configured at the same time/day, and compare it file by file with the content of the partition on the broken one.
Now, there were some differences in some binary files, as expected, but what surprised me was that one file was ONLY present in the working one...
initrd.img XD
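For anyone who wants to repeat the file-by-file comparison, here is a minimal sketch using Python's standard `filecmp` module. The function name and the two directory arguments are made up for illustration; it assumes both boot partitions are mounted (or copied) somewhere readable.

```python
import filecmp
from pathlib import Path


def compare_boot_dirs(working: str, broken: str) -> dict:
    """Report files missing on either side and files whose contents differ."""
    report = {"only_in_working": [], "only_in_broken": [], "different": []}

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        # left = working copy, right = broken copy
        report["only_in_working"] += [str(prefix / n) for n in cmp.left_only]
        report["only_in_broken"] += [str(prefix / n) for n in cmp.right_only]
        report["different"] += [str(prefix / n) for n in cmp.diff_files]
        # Recurse into directories that exist on both sides (e.g. overlays/)
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(working, broken), Path("."))
    return report
```

Entries under `only_in_working` are the interesting ones here: that is where a missing file like the one above would show up.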
Now, don't ask me why the hell the file was not there. Maybe it got corrupted somehow, since as far as I am aware nothing touches this partition except at boot time.
Luckily, there was a .bak file present in the partition, which I renamed and... it worked.
Lesson learned: I now have a copy of the boot partition of every Pi I manage (it is only 167MB before tar/gzip, so the cost is negligible) and I have it backed up safely on another system. Should another file get corrupted in the future (maybe this time without a .bak), I have an older working copy of it, and I can restore the service without the need to format everything.
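A rough sketch of how such a backup could be scripted with Python's standard `tarfile` module. The function name, the archive naming scheme and the paths in the usage line are assumptions for illustration, not what I actually run.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def backup_boot_partition(boot_dir: str, dest_dir: str, name: str) -> Path:
    """Write a timestamped, gzip-compressed tar archive of boot_dir into dest_dir."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{name}-boot-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores everything under "boot/" instead of the absolute path
        tar.add(boot_dir, arcname="boot")
    return archive
```

Something like `backup_boot_partition("/boot/firmware", "/mnt/backups", "pi-1")` would be the call, assuming the boot partition is mounted at `/boot/firmware` (adjust to wherever it is mounted on your system) and `/mnt/backups` is a hypothetical destination. The archive still has to be copied to another machine to be of any use when the disk itself dies.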
Ahah, so that's what the initrd is called on a Pi. :P Good catch!
Funnily enough, I don't have such a file on mine; the only *.img I have is the kernel, kernel8.img. I guess it's OS specific.
The .bak thing sounds like an interrupted update or something, as if the updater moved the current initrd to a backup file, then started building the new initrd and crashed or was rebooted before completion. That's what I really dislike about automatic updates: I prefer to know for sure when one is running, and to see the output. :)
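If an interrupted update is the cause, a quick sanity check after each update might catch it before the next reboot. The sketch below assumes the boot partition's config.txt names the kernel with a `kernel=` line and the initrd with an `initramfs <file> ...` line; whether a given OS image actually does so is an assumption to verify on your own system. It also ignores conditional sections and `include` lines, and the function name is hypothetical.

```python
from pathlib import Path


def missing_boot_files(boot_dir: str) -> list[str]:
    """Return files referenced by config.txt that are absent from boot_dir."""
    boot = Path(boot_dir)
    referenced = []
    for raw in (boot / "config.txt").read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("kernel="):
            referenced.append(line.split("=", 1)[1].strip())
        elif line.startswith("initramfs "):
            parts = line.split()
            if len(parts) >= 2:
                # split on commas in case several files are listed
                referenced += parts[1].split(",")
    return [name for name in referenced if name and not (boot / name).is_file()]
```

An empty list means every referenced file is present; it says nothing about whether those files are intact.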