Why aren't computers designed to handle power failure?
SiliconWizard:
As said: cost, usefulness, complexity. (It would be another potential point of failure and something to maintain - the batteries, for instance.)

As someone else said, thanks to reasonable filesystems and reasonable OSs, most computers can survive power loss with no harm done. I've certainly experienced quite a few power losses in the past (before I used UPSs) and never damaged anything or lost a file because of one.

Now if you want that very feature, it's easy. Buy a laptop.
pepelevamp:

--- Quote from: bd139 on June 12, 2020, 01:25:22 pm ---
Also comp sci guy. None of that really matters. All you need is a write barrier and transactional consistency. And we actually mostly have that.

In a lot of cases we have much much much better than that with journaling and nice new formats (Redis AOF I fucking love) which allow full ordered consistency.

--- End quote ---
yeah that's beautiful stuff. but saying ya only need transactional consistency is like saying 'it's easy. to solve this problem all ya need is the problem solved'. how ya gonna get transactional consistency into a multi-core SMP dude-bro PC with shared and non-shared caches and no atomic operations? :(

non-volatile memory is the ticket to all dis. CPUs can shut down & live on 0.001mA no problem but not DRAM.

I had this old 286 with non-volatile memory. This old rugged handheld unit that weighed like over 2kg in your hand & was made of IRON & nickel or something. you could break a car window with it. ya could take out the batteries for up to 30 days and put 'em back in and resume ya work. No jokes. Little LCD screen with AA batteries. that's the kind of tough computer that bullies you into doing the work for it.
krish2487:
Going on a tangent here... One reason I can think of why most UPS systems are AC and not DC is the overall cost of transmission.
It is pretty much the same reason why we transmit AC rather than DC. Notwithstanding small UPS systems which you place next to your PC, even a moderately sized installation for a couple of computers or even a small office will
1. Not run off a single 12 V battery, but off a higher-voltage DC bus
2. Cost far more to wire than an AC installation of a similar size - all the switchgear, safety circuits and wires will need to be sized for the DC currents involved (see the rough numbers below)
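
Rough numbers, just to put a figure on it - the 5 kW office load here is an assumed example, not anything from this thread. For the same power, current scales as I = P / V, so a low-voltage DC bus has to push far more current than a 230 V AC feed:

--- Code: ---
package main

import "fmt"

func main() {
	const loadWatts = 5000.0 // assumed small-office load, purely illustrative

	buses := []struct {
		name  string
		volts float64
	}{
		{"230 V AC mains", 230},
		{"48 V DC bus", 48},
		{"12 V DC bus", 12},
	}

	// I = P / V: the current the wiring and switchgear must be rated for.
	for _, b := range buses {
		fmt.Printf("%-15s %6.0f A\n", b.name, loadWatts/b.volts)
	}
}
--- End code ---

That works out to roughly 22 A at 230 V AC, 104 A at 48 V DC and 417 A at 12 V DC. Conductor size and switchgear ratings follow that current (and resistive loss follows its square), which is why the low-voltage DC run gets expensive fast.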

bd139:

--- Quote from: pepelevamp on June 12, 2020, 02:20:26 pm ---
--- Quote from: bd139 on June 12, 2020, 01:25:22 pm ---
Also comp sci guy. None of that really matters. All you need is a write barrier and transactional consistency. And we actually mostly have that.

In a lot of cases we have much much much better than that with journaling and nice new formats (Redis AOF I fucking love) which allow full ordered consistency.

--- End quote ---
yeah that's beautiful stuff. but saying ya only need transactional consistency is like saying 'it's easy. to solve this problem all ya need is the problem solved'. how ya gonna get transactional consistency into a multi-core SMP dude-bro PC with shared and non-shared caches and no atomic operations? :(

non-volatile memory is the ticket to all dis. CPUs can shut down & live on 0.001mA no problem but not DRAM.

I had this old 286 with non-volatile memory. This old rugged handheld unit that weighed like over 2kg in your hand & was made of IRON & nickel or something. you could break a car window with it. ya could take out the batteries for up to 30 days and put 'em back in and resume ya work. No jokes. Little LCD screen with AA batteries. that's the kind of tough computer that bullies you into doing the work for it.

--- End quote ---

I disagree. You're looking at one side of the set of problems.

Memory consistency is irrelevant to the point. Your data is divided into volatile and non-volatile across a neat line in the storage hierarchy. You only need consistency across the volatility barrier, which you get from write barriers and transaction journaling. But that's not even the issue.

As for transactional consistency, I write highly concurrent, redundant and reliable code that works across NUMA machines with 48 cores and 1 TB+ of RAM. I suggest you read Hoare's CSP paper as an introduction to my favoured approach, which doesn't need transactional memory or global atomicity. The reason none of that got implemented in our current state-of-the-art architectures is that, quite frankly, it sucked.
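
For anyone who hasn't met CSP: a minimal sketch of the style in Go, whose channels come more or less straight from Hoare's model. The worker pool below is purely illustrative, not code from any real system - the point is that processes share nothing and only pass messages, so no transactional memory or global atomicity is needed.

--- Code: ---
package main

import "fmt"

// worker owns its own state and talks to the rest of the program
// only through channels - no shared memory, no locks, no atomics.
func worker(id int, jobs <-chan int, results chan<- string) {
	for j := range jobs {
		results <- fmt.Sprintf("worker %d: %d squared is %d", id, j, j*j)
	}
}

func main() {
	jobs := make(chan int)
	results := make(chan string)

	for w := 1; w <= 4; w++ {
		go worker(w, jobs, results)
	}

	go func() {
		for i := 1; i <= 8; i++ {
			jobs <- i
		}
		close(jobs)
	}()

	for i := 0; i < 8; i++ {
		fmt.Println(<-results)
	}
}
--- End code ---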

Complete non-volatility is distinctly NOT an option because you're only looking at power failure for one set of state. The entire machine's state consists of thousands of little pockets of state: IO registers, DRAM buffers, SMC registers, SPI bus transactions, PCI bus transactions in flight, CPU configuration registers, etc. And we're only looking at the power failure scenario. How do we handle hardware failures or bus errors (sweep them under the carpet)? None of that can be consistently rationalised by hardware, really. As always, it's easier to get from ground zero to a known state than from an unknown state to a known state.

However, the shortcut - and the correct answer - is that the only thing that matters is the intent of what you are doing and recovering that intent from your non-volatile store. So all you have to do is:

1. make sure the intent is consistent on the part of the software writing to disk (sync + transaction journalling)
2. make sure the data is written in an order that makes sense (write barriers)
3. make sure the operation succeeds (the last SATA/NVMe transaction was committed - hold-up capacitors)

This works for more than just power-failure scenarios: fires, halon dumps, APC UPS issues, wars, dodgy fibre cables and laptop PMCs giving up before the battery is dead.
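
A minimal sketch of that intent-journalling dance in Go, assuming a made-up journal file and record format rather than any particular filesystem's or database's implementation. The ordering (append the intent, force it to stable storage, only then apply the change) is what makes recovery after a power cut possible:

--- Code: ---
package main

import (
	"fmt"
	"os"
)

// logIntent appends an intent record to the journal and forces it to
// stable storage before the caller applies the change. After a power
// cut, any intent that reached the journal can be replayed; anything
// that didn't is as if it never happened.
func logIntent(journal *os.File, record string) error {
	if _, err := journal.WriteString(record + "\n"); err != nil {
		return err
	}
	// Sync is the barrier across the volatility line: it does not
	// return until the record has been pushed to non-volatile media.
	return journal.Sync()
}

func main() {
	journal, err := os.OpenFile("intent.journal",
		os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		panic(err)
	}
	defer journal.Close()

	// 1. Record the intent durably...
	if err := logIntent(journal, "move 100 from A to B"); err != nil {
		panic(err)
	}
	// 2. ...then perform the real operation. If power dies between these
	// two steps, recovery replays the journal and finishes the job.
	fmt.Println("intent is durable, safe to apply the change now")
}
--- End code ---

Journaling filesystems and write-ahead-log databases do exactly this dance with their own journals, which is why sync + ordering + a committed last write covers the power-cut case.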
pepelevamp:

--- Quote from: bd139 on June 12, 2020, 02:41:07 pm ---
--- Quote from: pepelevamp on June 12, 2020, 02:20:26 pm ---
--- Quote from: bd139 on June 12, 2020, 01:25:22 pm ---
Also comp sci guy. None of that really matters. All you need is a write barrier and transactional consistency. And we actually mostly have that.

In a lot of cases we have much much much better than that with journaling and nice new formats (Redis AOF I fucking love) which allow full ordered consistency.

--- End quote ---
yeah that's beautiful stuff. but saying ya only need transactional consistency is like saying 'it's easy. to solve this problem all ya need is the problem solved'. how ya gonna get transactional consistency into a multi-core SMP dude-bro PC with shared and non-shared caches and no atomic operations? :(

non-volatile memory is the ticket to all dis. CPUs can shut down & live on 0.001mA no problem but not DRAM.

I had this old 286 with non-volatile memory. This old rugged handheld unit that weighed like over 2kg in your hand & was made of IRON & nickel or something. you could break a car window with it. ya could take out the batteries for up to 30 days and put 'em back in and resume ya work. No jokes. Little LCD screen with AA batteries. that's the kind of tough computer that bullies you into doing the work for it.

--- End quote ---

I disagree. You're looking at one side of the set of problems.

Memory consistency is irrelevant to the point. Your data is divided into volatile and non-volatile across a neat line in the storage hierarchy. You only need consistency across the volatility barrier, which you get from write barriers and transaction journaling. But that's not even the issue.

As for transactional consistency, I write highly concurrent, redundant and reliable code that works across NUMA machines with 48 cores and 1 TB+ of RAM. I suggest you read Hoare's CSP paper as an introduction to my favoured approach, which doesn't need transactional memory or global atomicity. The reason none of that got implemented in our current state-of-the-art architectures is that, quite frankly, it sucked.

Complete non-volatility is distinctly NOT an option because you're only looking at power failure for one set of state. The entire machine's state consists of thousands of little pockets of state: IO registers, DRAM buffers, SMC registers, SPI bus transactions, PCI bus transactions in flight, CPU configuration registers, etc. And we're only looking at the power failure scenario. How do we handle hardware failures or bus errors (sweep them under the carpet)? None of that can be consistently rationalised by hardware, really. As always, it's easier to get from ground zero to a known state than from an unknown state to a known state.

However, the shortcut - and the correct answer - is that the only thing that matters is the intent of what you are doing and recovering that intent from your non-volatile store. So all you have to do is:

1. make sure the intent is consistent on the part of the software writing to disk (sync + transaction journalling)
2. make sure the data is written in an order that makes sense (write barriers)
3. make sure the operation succeeds (the last SATA/NVMe transaction was committed - hold-up capacitors)

This works for more than just power-failure scenarios: fires, halon dumps, APC UPS issues, wars, dodgy fibre cables and laptop PMCs giving up before the battery is dead.

--- End quote ---

Valid. I yield to your logic.

I'm still taking my rant about computers being full of fibs & lies with me, though. I'm still afraid of Intel's spooky-ghost execution and ya can't make me like it!