(Newbie question) Did I handle my system crashing correctly?
I've just installed Linux (Fedora 40 KDE) on my main PC over the weekend, so I'm a complete newbie and I apologize if some of my questions are nonsensical. Yesterday evening the system seemed to completely lock up at a certain point while playing Red Dead Redemption 2 for the first time (installed & run via Steam using Proton Experimental). I'd love to know if I handled this situation correctly and how to avoid this or handle it more gracefully in the future. I'll begin by recounting what happened and then ask my questions:
The game froze during a cutscene and continued to play audio for a bit after it froze visually, but then that stopped too. I have two monitors: the second went completely black and the first one was frozen on the last frame of the game. As far as I could tell, nothing in KDE was still responding to normal key presses or the mouse.
After some searching online I decided to try the ctrl + alt + (f2, f3, ... , f6) key combinations to get into a console; that didn't work. As a last resort I tried alt + SysRq (print screen) + REISUB to safely reboot it. That ALSO didn't work. It was pretty damn late in the day, so I just decided to risk it and use the power button on my PC.
I was prepared for it not to boot anymore due to data corruption or something, but it seemed mostly fine? My KDE panels were slightly messed up (but that took like 10 sec to fix) and besides that the only odd thing I've found so far is that Steam refused to start properly and I had to reinstall it.
So did I handle this situation correctly? Specifically:
Did alt + printscreen + REISUB save my system or do nothing? As I said, it didn't reboot when I did it, so I thought it was useless. But after I forcibly restarted my PC and looked it up some more, it seems everything except alt + printscreen + S may have been disabled. So was alt + printscreen + S responsible for my system still starting without too many problems after I forcibly shut it down?
Why did this happen & how do I prevent it? My system should be powerful enough to run RDR2 (Radeon RX 6800, Ryzen 5 5600X, 32GB RAM) and I had nearly no problems up until the crash. So what's at fault? On ProtonDB RDR2 has pretty good ratings; did I just get unlucky and find one of the few edge cases where it breaks? But even then, why would a Proton/game crash seemingly take the whole OS with it?
Is it a bad or a good idea to try and trigger this again on purpose? I'd really like to know if this was a freak accident or a consistent problem (and, if it's consistent, whether e.g. switching to Proton 9.0.1 alleviates it). So was I lucky that nothing on my PC got badly damaged from this incident, and should I avoid triggering it again for fear of permanent damage? Or can I expect that having to reinstall Steam every time it crashes is the worst that could happen while testing this?
UPDATE:
I went back and did the same part of the game again, but this time running it with Proton 9.0.1, and the crash still occurred, in the exact same spot in the cutscene too. For reference, it crashed both times during this cutscene: https://www.youtube.com/watch?v=7UHv0SiVhWY @ around 1:23 when the explosion goes off (I only get to hear it briefly; the visuals freeze seemingly just before it explodes).
Trying the ctrl + alt + f keys didn't seem to do anything again. I had at least enabled the SysRq keys, and REISUB appeared to work and got me back into the system this time without having to adjust KDE panels or reinstall Steam. Visually the crash was a little different this time: I hit win/meta soon after it happened, which after a second or two exchanged the stuck game visuals for a half cut-off browser window on my main monitor (and black otherwise), and my secondary monitor was filled with black and white noise with a bit of color in between.
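For anyone wondering which of the REISUB keys are actually enabled on their system: the kernel exposes this as a bitmask in /proc/sys/kernel/sysrq. Below is a minimal sketch that decodes it, using the bit values from the kernel's SysRq documentation. The function name `decode_sysrq` and the short labels are my own; a value of 16 would mean only sync (S) is allowed, which would match what I read about the default.

```python
from pathlib import Path

# Bit values as listed in the kernel's SysRq admin guide.
# The letters in parentheses are the REISUB keys each bit covers.
SYSRQ_BITS = {
    2: "console log level control",
    4: "keyboard control (R)",
    8: "process debugging dumps",
    16: "sync (S)",
    32: "remount read-only (U)",
    64: "signal processes (E, I)",
    128: "reboot/poweroff (B)",
    256: "nicing of real-time tasks",
}


def decode_sysrq(value: int) -> list[str]:
    """Return the SysRq function groups enabled by a kernel.sysrq value."""
    if value == 0:
        return []                         # 0 disables SysRq completely
    if value == 1:
        return list(SYSRQ_BITS.values())  # 1 enables every function
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


if __name__ == "__main__":
    path = Path("/proc/sys/kernel/sysrq")
    if path.exists():
        current = int(path.read_text().strip())
        print(f"kernel.sysrq = {current}")
        for name in decode_sysrq(current):
            print(f"  enabled: {name}")
```

If only sync shows up, the rest of the sequence is silently ignored. Setting `kernel.sysrq=1` via sysctl enables all of them.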
UPDATE 2 (17/06/2024):
I tried it again for the first time since the original post. I'm now on kernel 6.9.4 and the crash occurred in the exact same spot, looking more or less as described in the previous instances. I managed to get back into a normal state with alt + SysRq + i (alt + SysRq + k didn't seem to have any effect).
UPDATE 3 (16/09/2024):
I've tried it again with Proton 9.0-2 and kernel 6.10.9, and it's still crashing at the exact same point as usual. The only difference is that this time alt + SysRq + REIB didn't seem to have any effect. Though I might have forgotten "I", now that I think about it again. I had to do a hard restart using the power button, but it doesn't seem like anything broke.
UPDATE 3.5 (16/09/2024):
Tried the next newest Proton version Steam has (Experimental). Now the dialogue during the gameplay bit just before the cutscene doesn't trigger, then the game goes into "cutscene mode" (I think; I get black bars top and bottom and the menu becomes unavailable) but no cutscene plays and I (presumably) get softlocked. I tried waiting in case it was playing and I just couldn't see it, but after 5 min or so it never ended.
you probably got a kernel panic, which froze the system. it's like a BSOD on windows, except on linux, there isn't a proper stack to handle them when they happen while you have a graphical session running, so it kinda just freezes
i don't think reisub would do anything, because the kernel was probably already dead
you don't risk corrupting much data by hard-resetting your pc on linux -- journaling filesystems, like ext4 or btrfs, are built to be resilient to sudden power loss (or a kernel crash). if a program was writing a file at the time the kernel crashed, this one file may be corrupted, because the program would get killed before it finished writing the file, but all in all, it's pretty unlikely. outside of fs bugs, which are thankfully few and far between on time-tested filesystems like ext4, you shouldn't have to worry much about sudden power loss!
unfortunately, figuring out the cause of these issues can be challenging -- i've had many such occurrences, and you have no logs to go off of (because the system doesn't have time to save them), so you'd most likely need to figure out a way to send your kernel logs onto another system to record them
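one way to do that is the kernel's netconsole module, which sends kernel messages as plain UDP packets to another machine (check the kernel docs for the exact module parameters). here's a rough sketch of a receiver you could run on the second machine. the function names `open_listener` and `collect` are made up for this example, and port 6666 is an assumption based on netconsole's usual default target port:

```python
import socket


def open_listener(host: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Bind a UDP socket that netconsole packets can be sent to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect(sock: socket.socket, max_messages: int, timeout: float = 5.0) -> list[str]:
    """Receive up to max_messages datagrams, stopping early after `timeout` seconds of silence."""
    sock.settimeout(timeout)
    lines: list[str] = []
    try:
        while len(lines) < max_messages:
            data, _sender = sock.recvfrom(65535)
            # Kernel messages are text; replace any undecodable bytes.
            lines.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    except socket.timeout:
        pass  # nothing arrived within the timeout; return what we have
    return lines


# Example use on the second machine (gives up after 10 minutes of silence):
#   sock = open_listener()
#   for line in collect(sock, max_messages=10000, timeout=600):
#       print(line)
```

since the packets leave the crashing machine as they are generated, whatever the kernel printed right before the freeze should end up on the other box, even if nothing got written to the local disk.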
as general mitigation steps, you should try monitoring your cpu temperature a bit closer - it could be high temperature tripping the safeties of your motherboard/cpu to avoid physical damage to them - in which case, try installing a daemon to control your cpu frequency, like auto-cpufreq, or something like thermald specifically made to throttle your cpu if it gets too hot (though i think that one is intel specific)
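for the monitoring part, the kernel exposes sensor readings under /sys/class/hwmon, with temperatures stored in millidegrees Celsius. a minimal sketch for dumping them is below. `read_temps` is just a name i picked, and which chips and labels show up depends on your hardware and drivers (on a ryzen + radeon box i'd expect something like k10temp and amdgpu, but that's an assumption):

```python
from pathlib import Path


def read_temps(root: Path = Path("/sys/class/hwmon")) -> dict[str, float]:
    """Return {"chip/label": degrees_celsius} for every readable hwmon temperature."""
    temps: dict[str, float] = {}
    for chip in sorted(root.glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else chip.name
        for temp_input in sorted(chip.glob("temp*_input")):
            prefix = temp_input.name[: -len("_input")]  # e.g. "temp1"
            label_file = chip / f"{prefix}_label"
            label = label_file.read_text().strip() if label_file.exists() else prefix
            try:
                millidegrees = int(temp_input.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors refuse to be read; skip them
            temps[f"{chip_name}/{label}"] = millidegrees / 1000.0
    return temps


if __name__ == "__main__":
    for sensor, celsius in read_temps().items():
        print(f"{sensor}: {celsius:.1f} C")
```

run it in a loop in a terminal on the second monitor while the cutscene plays. if the numbers look normal right up to the freeze, heat probably isn't the culprit.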
Well, it ends everything, including the graphics driver. So the GPU just keeps running and showing the same thing, because nothing has told it otherwise.
This is also why the audio will continue and then either buzz or loop a short segment. It plays out whatever it has in its buffer, then loops. If the looping segment is long enough, it might be recognizable. If it's very short, it sounds like a buzz.
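To make the looping concrete, here is a toy model of a playback ring buffer. It is only an illustration, not how any particular audio driver is written; the class name `RingBuffer` and the sample rate and buffer size in the example are assumptions I picked.

```python
class RingBuffer:
    """Toy playback buffer: software writes samples in, hardware reads them out in a circle."""

    def __init__(self, size: int) -> None:
        self.data = [0] * size
        self.write_pos = 0
        self.read_pos = 0

    def write(self, samples: list[int]) -> None:
        # The audio stack normally keeps calling this to stay ahead of the reader.
        for sample in samples:
            self.data[self.write_pos % len(self.data)] = sample
            self.write_pos += 1

    def read(self, count: int) -> list[int]:
        # The hardware keeps reading whether or not anything new was written.
        out = []
        for _ in range(count):
            out.append(self.data[self.read_pos % len(self.data)])
            self.read_pos += 1
        return out


def loop_frequency(sample_rate_hz: int, buffer_frames: int) -> float:
    """How many times per second a stale buffer repeats."""
    return sample_rate_hz / buffer_frames


if __name__ == "__main__":
    buf = RingBuffer(4)
    buf.write([1, 2, 3, 4])  # last data written before the freeze
    print(buf.read(10))      # the reader just keeps cycling through it
    print(loop_frequency(48000, 1024))
```

With the assumed 48 kHz sample rate and a 1024-frame buffer, the stale audio repeats about 47 times per second, which is why a short buffer sounds like a buzz rather than a recognizable loop.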