Author Topic: Linux copies Windows, now has the Blue Screen of Death!  (Read 2735 times)


Offline DiTBhoTopic starter

  • Super Contributor
  • ***
  • Posts: 4230
  • Country: gb
Linux copies Windows, now has the Blue Screen of Death!
« on: June 28, 2024, 01:39:51 pm »
Seriously, I implemented something similar on XINU: a very simple graphical item (x_console(module_id, function_id, msg, x, y)) that shows some info in case of a panic, because otherwise, if the kernel panics, you only know about it by looking at the serial console. But now Linux gets its own Windows-style Blue Screen of Death!

DRM_panic() is included in experimental kernels >= 6.10 for a few DRM modules. It shows some info about the panic (regs? status? syscall_trak?) and prompts the user to reboot, and its usefulness will be further expanded in the near future.

 :D :D :D
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 
The following users thanked this post: pdenisowski

Offline DiTBhoTopic starter

  • Super Contributor
  • ***
  • Posts: 4230
  • Country: gb
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #1 on: July 05, 2024, 03:26:55 pm »
any comments?
as strange as it may sound, I thought this was rather funny news  :-//

technically, however, I find it very interesting, especially on devices that do not have easy access to the serial console
just to name one... the Teres1 laptop that I'm still working on
 

Offline garrettm

  • Frequent Contributor
  • **
  • Posts: 274
  • Country: us
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #2 on: July 26, 2024, 06:59:42 am »
I think this is a nice feature. I've had a couple of hiccups with my Intel i9 11th gen Rocket Lake CPU and z590 motherboard that this might have helped with (assuming it actually tells people what caused the kernel panic or system to hang).

Basically, the Gigabyte motherboard's FW didn't disable the Intel integrated audio on the PCH, and the kernel tries to route an interrupt for it. But it's broken in FW, and the only solution is to stop the snd_hda_intel kernel module from loading, thereby just ignoring the non-functional HW. The board uses a USB-connected Realtek part, so disabling that module doesn't hurt anything. I probably should have submitted a kernel bug, since this causes extreme loading of the CPU (100% load on a single core) from some sort of infinite loop while trying to route it an IRQ or whatever.

Then there is the broken C state support on Rocket Lake and Comet Lake CPUs. Basically the system will hang if the C state goes below 4 or 5, but maybe it's 6? I forgot. Regardless, it's pretty annoying and basically wastes power, but whatever. Intel clearly doesn't care, and the kernel maintainers keep refusing patches that attempt to correct the issue. So proper power management is just not going to be a thing for the 10th and 11th gen desktop parts.

And lastly, my WDC 850X NVMe has broken PCIe power management on any OS, and I have to use pcie_aspm=off to keep "correctable error" messages from flooding dmesg. It's all a hot mess of broken FW and HW quirks that have taken a good while to figure out, but everything is finally "working" well enough that I'm satisfied enough to not go Bob Pease and throw it off a parking garage roof to its doom.

Anyways, currently, for most modern computers, the screen just freezes. So seeing why it locked up would be nice. Also, almost all non-server / IoT / embedded parts (i.e. nearly all desktops/laptops/NUCs) don't properly configure the watchdogs built into the CPU / motherboards. So they can't trigger an automatic reboot: we just have frozen screens. Lovely.
« Last Edit: July 26, 2024, 07:06:46 am by garrettm »
 
The following users thanked this post: DiTBho

Offline DiTBhoTopic starter

  • Super Contributor
  • ***
  • Posts: 4230
  • Country: gb
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #3 on: August 01, 2024, 12:13:31 pm »
More progress with kernel 6.10 :D
 

Offline madires

  • Super Contributor
  • ***
  • Posts: 8138
  • Country: de
  • A qualified hobbyist ;)
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #4 on: August 01, 2024, 01:35:54 pm »
I'd prefer a Guru Meditation . :D
 
The following users thanked this post: DiTBho

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15292
  • Country: fr
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #5 on: August 01, 2024, 09:26:32 pm »
I'd prefer a microkernel.
 

Offline DiTBhoTopic starter

  • Super Contributor
  • ***
  • Posts: 4230
  • Country: gb
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #6 on: August 02, 2024, 04:02:28 am »
I'd prefer a microkernel.

Linux 6.10 does not run well on HPPA
It has changed a lot since v5 ...
I don't think it will be microkernel, not like Haiku
which I don't know ...
... maybe one day it could even replace Linux  :-//
 

Offline guenthert

  • Frequent Contributor
  • **
  • Posts: 754
  • Country: de
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #7 on: August 02, 2024, 07:42:20 am »
I'd be somewhat surprised if the machine were able to reliably perform complex operations like rendering a window in the GUI (X11, Wayland, ...) when the system is in such a bad state that the kernel considered a panic justified.  MS Windows doesn't attempt to do that either, but instead falls back to a very simple screen driver.

More useful IMHO, and probably with a greater chance of success, would be to write the panic message into (perhaps previously allocated) permanent storage, so that it can be read and displayed on the next boot in a then sane environment.  The user can then copy and paste that message, which eases searching the Internet for such occurrences and perhaps already found solutions. Afaik, that has been deemed not possible (once the kernel has panicked, it can't be trusted to write to stable storage or much of anything, hence the panic).  I'm sure that has been discussed before, but you probably want to discuss this on LKML, rather than here.
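To make the idea concrete, here is a rough sketch in C of what such a pre-allocated record could look like: a marker, a length and a checksum around the message, checked on the next boot before anything is shown. This is not the Linux pstore interface or any real kernel's code. Every name (panic_record, plog_store, plog_fetch, PLOG_MAGIC), the toy checksum and the struct in ordinary RAM standing in for the reserved storage are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PLOG_MAGIC   0x50414E43u  /* arbitrary marker ("PANC"), hypothetical */
#define PLOG_MSG_MAX 256

/* Layout of the reserved region; a real one would live in NVRAM or
 * memory that survives a reboot, not in an ordinary variable. */
struct panic_record {
    uint32_t magic;
    uint32_t length;
    uint32_t checksum;
    char     msg[PLOG_MSG_MAX];
};

/* Toy checksum for the sketch; a real design would likely use a CRC. */
uint32_t plog_checksum(const char *data, uint32_t len)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; i++)
        sum = sum * 31u + (uint8_t)data[i];
    return sum;
}

/* Called from the panic path: no allocation, no locks, just a copy. */
void plog_store(struct panic_record *rec, const char *msg)
{
    assert(rec != NULL && msg != NULL);
    size_t len = strlen(msg);
    if (len > PLOG_MSG_MAX)
        len = PLOG_MSG_MAX;            /* truncate rather than overflow */
    memcpy(rec->msg, msg, len);
    rec->length = (uint32_t)len;
    rec->checksum = plog_checksum(rec->msg, rec->length);
    /* Marker is set last; real hardware would also need barriers or
     * flushes for that ordering to mean anything. */
    rec->magic = PLOG_MAGIC;
}

/* Called on the next boot. Returns the number of bytes copied into
 * out, or -1 if there is no valid record. A valid record is cleared
 * so it is reported only once. */
int plog_fetch(struct panic_record *rec, char *out, size_t out_size)
{
    assert(rec != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    if (rec->magic != PLOG_MAGIC || rec->length > PLOG_MSG_MAX)
        return -1;
    if (plog_checksum(rec->msg, rec->length) != rec->checksum)
        return -1;                     /* corrupted: do not trust it */
    size_t n = rec->length < out_size - 1 ? rec->length : out_size - 1;
    memcpy(out, rec->msg, n);
    out[n] = '\0';
    rec->magic = 0;
    return (int)n;
}
```

The checksum is there precisely because of the trust problem mentioned above: the reader on the next boot rejects anything a half-dead kernel wrote incorrectly, instead of displaying garbage.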

Luckily, this happens rarely.  I'm a "Linux first" user and I don't recall when I last saw a panic, much less one which wasn't caused by flawed hardware, but then, I haven't lived on the bleeding edge in years either.
« Last Edit: August 02, 2024, 07:48:02 am by guenthert »
 
The following users thanked this post: DiTBho, RobtP

Online magic

  • Super Contributor
  • ***
  • Posts: 7189
  • Country: pl
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #8 on: August 02, 2024, 07:53:23 am »
There was an idea to store panic messages in some "UEFI nonvolatile storage", until a bunch of Samsung (IIRC) laptops were found to brick when the storage was actually written to by anyone. I suspect it's still possible, but distributions probably disable it by default.
 
The following users thanked this post: DiTBho

Offline DiTBhoTopic starter

  • Super Contributor
  • ***
  • Posts: 4230
  • Country: gb
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #9 on: August 02, 2024, 10:45:05 am »
HPPA has a little nvram, usually used to store the logs of hw failures.

It can be used to store other info, such as kernel logs.

I usually use the serial (console) to store logs on a uart data logger.
 

Offline pdenisowski

  • Frequent Contributor
  • **
  • Posts: 912
  • Country: us
  • Product Management Engineer, Rohde & Schwarz
    • Test and Measurement Fundamentals Playlist on the R&S YouTube channel
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #10 on: August 02, 2024, 10:49:11 am »
any comments?
as strange as it may sound, I thought this was a rather funny news  :-//

I thought it was hilarious :) 

"Kernel panic, core dumped" is one of my favorite memories of my early (R&D) engineering career when I used to write a lot of code. 

I used to save the core dumps and send them as email attachments to spammers in the hopes of using up their mailbox quota ... good times :)
Test and Measurement Fundamentals video series on the Rohde & Schwarz YouTube channel:  https://www.youtube.com/playlist?list=PLKxVoO5jUTlvsVtDcqrVn0ybqBVlLj2z8
 

Offline DiTBhoTopic starter

  • Super Contributor
  • ***
  • Posts: 4230
  • Country: gb
Re: Linux copies Windows, now has the Blue Screen of Death!
« Reply #11 on: August 02, 2024, 12:52:27 pm »
"Kernel panic, core dumped" is one of my favorite memories of my early (R&D) engineering career when I used to write a lot of code. 

I implemented something similar even for my libraries  :D

lib_debug_v2 contains panic(fid, mid, reason)
     uint32_t fid = function_id
     uint32_t mid = module_id(1)
     safestring_t reason, which may contain regs and other details

I also use this system on Xinu, a rather educational kernel that included neither a panic primitive nor any graphics support; now there is a panic(), and I also added one for graphic mode.

It is much, much simpler than on Linux. You don't even have an MMU, you have "tasks" instead of "processes", you do not go through DRM, and there is no X11 or Wayland. There is a video driver that exposes the graphic LCD (320x240, with a built-in controller) as a very simple framebuffer: just a huge array where each 32-bit cell represents 1 pixel { R0..7, G0..7, B0..7, A0..7 }. There are very simple primitives that draw lines, squares, solid fills, and text, with only 3 fonts, all fixed-width.
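For anyone who wants to picture it, here is a minimal sketch in C of a framebuffer like the one described. The 320x240 geometry and the 32-bit RGBA cell come from the description above; the names (framebuffer, fb_rgba, fb_put_pixel, fb_fill, fb_hline), the byte order inside a cell, and the plain array standing in for the LCD memory are assumptions for illustration, not the actual driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Geometry taken from the post: 320x240 LCD, one 32-bit cell per pixel. */
#define FB_WIDTH  320
#define FB_HEIGHT 240

/* Hypothetical backing store; on real hardware this would be the
 * memory exposed by the video driver, not an ordinary array. */
uint32_t framebuffer[FB_WIDTH * FB_HEIGHT];

/* Pack one pixel. Putting R in the top byte is an assumption. */
uint32_t fb_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Write one pixel, silently clipping anything outside the panel. */
void fb_put_pixel(int x, int y, uint32_t colour)
{
    if (x < 0 || x >= FB_WIDTH || y < 0 || y >= FB_HEIGHT)
        return;
    framebuffer[(size_t)y * FB_WIDTH + (size_t)x] = colour;
}

/* Solid fill of the whole screen. */
void fb_fill(uint32_t colour)
{
    for (size_t i = 0; i < (size_t)FB_WIDTH * FB_HEIGHT; i++)
        framebuffer[i] = colour;
}

/* Horizontal line on row y covering x0..x1 inclusive, in either order. */
void fb_hline(int x0, int x1, int y, uint32_t colour)
{
    if (x0 > x1) {
        int tmp = x0;
        x0 = x1;
        x1 = tmp;
    }
    assert(x0 <= x1);
    for (int x = x0; x <= x1; x++)
        fb_put_pixel(x, y, colour);
}
```

With something like this, a blue panic screen starts with fb_fill(fb_rgba(0, 0, 255, 255)) before any text is drawn.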

The kernel/panic.c:panic() does nothing but write the error information to a list of buffers, usually only two:
  • the system console, on which only the UART driver ever operates
  • the graphic console, which is drawn with the primitives mentioned above, so that you see on the LCD that the kernel has panicked
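The fan-out just described can be sketched in C roughly as follows. The panic(fid, mid, reason) shape follows the signature given earlier in this post. Everything else is a stand-in: a const char * replaces safestring_t, the two sink functions only record the message instead of driving a UART or an LCD, halt_system() sets a flag where a real halt() would stop the machine, and the message format is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PANIC_MSG_MAX 128

/* One output sink for panic text (hypothetical type name). */
typedef void (*panic_sink_fn)(const char *msg);

/* Stand-ins for the real consoles: they only remember the last message. */
char uart_last[PANIC_MSG_MAX];
char lcd_last[PANIC_MSG_MAX];

void uart_console_write(const char *msg)
{
    snprintf(uart_last, sizeof uart_last, "%s", msg);
}

void lcd_console_write(const char *msg)
{
    snprintf(lcd_last, sizeof lcd_last, "%s", msg);
}

/* The "list of buffers": usually just these two. */
panic_sink_fn panic_sinks[] = { uart_console_write, lcd_console_write };

/* Stand-in for halt(); a real kernel would not return from it. */
int system_halted;

void halt_system(void)
{
    system_halted = 1;
}

/* Format the report once, hand it to every sink, then halt. */
void panic(uint32_t fid, uint32_t mid, const char *reason)
{
    char msg[PANIC_MSG_MAX];

    assert(reason != NULL);
    snprintf(msg, sizeof msg, "PANIC mid=%lu fid=%lu: %s",
             (unsigned long)mid, (unsigned long)fid, reason);

    for (size_t i = 0; i < sizeof panic_sinks / sizeof panic_sinks[0]; i++)
        panic_sinks[i](msg);

    halt_system();
}
```

Keeping the sinks in a table means panic() itself never needs to know whether a display is attached; a board without an LCD simply registers one sink.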

When something breaks catastrophically, e.g. the filesystem has a bug so serious that it cannot continue (without causing more data corruption), it invokes panic(), which halts the system. Typically, though, before halt() is invoked you still have the serial and the graphical display functional, at least with the simplest primitives.

This is good because the LCD immediately displays the blue screen of death, indicating that the system has gone into panic, and you can investigate on the system console, where you will also have all the details saved in a file thanks to minicom which can record.

Very nice, and very useful! I really like it  :D

(1) "module" means ... the physical C file to which a function belongs
 
The following users thanked this post: pdenisowski

