Topic: Help me to find a ATX Main board with ECC that is not for gaming + tower case

Offline Zucca (Topic starter)

  • Supporter
  • ****
  • Posts: 4308
  • Country: it
  • EE meid in Itali
Quote
I don't know what happens with DDR5

Well, it should be easy since ECC will finally be integrated by default in the DDR5 protocol, AFAIK.
In other words, all DDR5 is ECC.

It kind of makes sense: the speed is so high that you need some error correction by default.

The above is just some YouTube info I got; I could be wrong since I did not read all the DDR5 protocol tech specs...
« Last Edit: March 30, 2023, 06:14:07 pm by Zucca »
Can't know what you don't love. St. Augustine
Can't love what you don't know. Zucca
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16864
  • Country: lv
Quote from: Zucca
Quote
I don't know what happens with DDR5

Well, it should be easy since ECC will finally be integrated by default in the DDR5 protocol, AFAIK.
In other words, all DDR5 is ECC.

It kind of makes sense: the speed is so high that you need some error correction by default.

The above is just some YouTube info I got; I could be wrong since I did not read all the DDR5 protocol tech specs...

DDR5 has on-chip ECC, but it's totally internal; the memory bus must still have its own ECC, the same as previously.
 

Offline Twoflower

  • Frequent Contributor
  • **
  • Posts: 737
  • Country: de
I'm not sure if all DDR5 DIMMs are ECC. If I got that right, the on-die ECC that DDR5 specifies has nothing to do with the ECC known from DDR4 and earlier generations.

If I understand that right, the on-die ECC corrects errors that happen within the memory array on the die itself. The data transmission to and from the CPU is not covered by that. For this there are still DDR5 ECC DIMMs available, which have the extra RAM and the additional data lines that 'conventional' ECC needs.

The big question is what the biggest source of bit flips is. It is possible that the memory array itself is the biggest source; there the on-die ECC could help. The success of attacks like Rowhammer might also be reduced by that.
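
To make the on-die versus bus distinction concrete, here is a toy sketch in Python. It uses a tiny Hamming(7,4) code as a stand-in for both kinds of ECC; real DDR5 parts use much wider codewords, and every function name here is made up for illustration. It walks through a flip inside the array that the on-die decoder repairs, a flip on the bus that goes unnoticed without side-band ECC, and the same kind of bus flip being repaired when check bits travel with the data.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (0..15) as a 7-bit Hamming codeword, positions 1..7."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                          # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions with bit 0 set
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions with bit 1 set
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions with bit 2 set
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, flipped_position); position 0 means no error was seen."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of set positions is 0 for a valid word
    if syndrome:
        code[syndrome] ^= 1                 # a single flipped bit sits at the syndrome
    data = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(data)), syndrome


def flip(bits, position):
    """Return a copy of bits with the bit at a 1-based position inverted."""
    out = list(bits)
    out[position - 1] ^= 1
    return out


VALUE = 0b1011

# 1) A cell flips inside the array; the on-die decoder repairs it before data leaves the chip.
on_die_value, on_die_fix = hamming74_decode(flip(hamming74_encode(VALUE), 5))

# 2) No side-band ECC: only the 4 data bits cross the bus, so a flip there goes unnoticed.
bus_bits = flip([(on_die_value >> i) & 1 for i in range(4)], 3)
received_plain = sum(bit << i for i, bit in enumerate(bus_bits))

# 3) Side-band ECC: check bits travel with the data, so the controller can repair the flip.
received_ecc, bus_fix = hamming74_decode(flip(hamming74_encode(VALUE), 6))

print(on_die_value == VALUE, received_plain == VALUE, received_ecc == VALUE)
```

In case 2 the receiver has no way to tell that the value changed, which is the gap the extra chips and data lines on an ECC DIMM are there to close.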
« Last Edit: March 30, 2023, 06:32:03 pm by Twoflower »
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 5907
  • Country: es
ECC RAM? Did nerd gamers run out of hype?
I have some money, what else can I buy with it?

I wonder how 99.999% of people have survived without ECC all these years.
Nevermind - To overclock it 5% more without crashing, gaining 0.001% actual framerate increase  ::)
Hantek DSO2x1x            Drive        FAQ          DON'T BUY HANTEK! (Aka HALF-MADE)
Stm32 Soldering FW      Forum      Github      Donate
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16864
  • Country: lv
Quote from: DavidAlfa
ECC RAM? Did nerd gamers run out of hype?
I have some money, what else can I buy with it?

I wonder how 99.999% of people have survived without ECC all these years.
Nevermind - To overclock it 5% more without crashing, gaining 0.001% actual framerate increase  ::)

Easily: with occasional bit flips and small data corruption somewhere, usually unnoticed. With current amounts of RAM, it's not a question of if it will happen but how often. Especially considering that most consumer RAM is factory overclocked and runs faster than the IC manufacturer guarantees.
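
The "how often" part scales with both RAM size and uptime. A rough sketch of that scaling, assuming bit flips arrive independently at a constant rate (a Poisson model): the rate constant below is a pure placeholder, not a measured figure, and the function name is made up for illustration.

```python
import math


def p_at_least_one_flip(rate_per_gb_hour, ram_gb, hours):
    """Probability of one or more flips, assuming independent events at a constant rate."""
    expected_flips = rate_per_gb_hour * ram_gb * hours
    return 1.0 - math.exp(-expected_flips)


# Hypothetical placeholder: substitute a rate measured on your own hardware.
PLACEHOLDER_RATE = 1e-6  # flips per GB per hour, NOT a real-world figure

for ram_gb in (4, 64):
    p = p_at_least_one_flip(PLACEHOLDER_RATE, ram_gb, hours=24 * 365)
    print(f"{ram_gb} GB for one year: {p:.1%} chance of at least one flip")
```

Whatever the real rate is, the expected number of flips grows linearly with installed RAM and with hours powered on, so the same per-bit reliability means more events on a bigger machine.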
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 8646
  • Country: gb
Quote from: DavidAlfa
ECC RAM? Did nerd gamers run out of hype?
I have some money, what else can I buy with it?

I wonder how 99.999% of people have survived without ECC all these years.
Nevermind - To overclock it 5% more without crashing, gaining 0.001% actual framerate increase  ::)

About 20% of computers are servers, and they have always used some kind of memory protection. First parity to detect errors, then ECC to work through the errors. Most domestic PCs used to have parity as standard in the 80s. Then it was gradually dropped as memory became more reliable. Quirky memory problems are not unusual. Many a motherboard gives some small percentage of memory errors when you put more than one memory stick on the same memory bus. It's not at all unusual to see things like memtest86 flush out some obscure memory issues.
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16864
  • Country: lv
The problem is that it's not only problematic RAM, CPU or motherboard. There are cosmic rays which will occasionally flip a bit even if there are no hardware problems whatsoever. In the past I had enough RAM problems which drove me nuts and which no RAM test would detect. The worst part was that I could not figure out where the issue was without randomly swapping hardware and testing for months to catch that nasty once-in-two-to-four-weeks BSOD. And it happened to me several times. So in the end I decided on a big no to a main PC without ECC.
 
The following users thanked this post: Zucca

Offline nightfire

  • Frequent Contributor
  • **
  • Posts: 585
  • Country: de
I had this sh*tshow with AMD CPUs and some chipsets across different mainboards and graphics cards. At work, I built some rigs with the mentioned ASUS x570 WS ACE mainboard for our software developers, and with an Nvidia P400 it worked mostly fine. My own Gigabyte board (a budget thing, because I did not need more comfort) and an ASUS Prime series board exhibited lots of initial problems caused by the initial init of the graphics card. Putting in a different card got them to boot, and after that everything was fine.
So probably a BIOS problem in some combinations.

At work, I have about 6 systems with that board, 5 with a 3700X CPU and one with a 5950, all running stable, and our devs use VMs with VMware on them. In our setup I found them to be reliable.

For cases, I really liked the Fractal Design 7 Compact: nice if you do not need to fit traditional 3.5" HDDs in it, and big enough to allow for water cooling.
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 5907
  • Country: es
I know what ECC is lol, cosmic rays flipping a bit is an extremely rare event.
In the 0.00000000001% chance of it happening you'll get a BSOD, the system will reboot and you'll resume gaming with your RGB hype, it's not a mainframe/server that should stay on at all costs, where a single minute off means $$$ fleeing away.
« Last Edit: March 30, 2023, 10:29:28 pm by DavidAlfa »
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14472
  • Country: fr
Quote from: DavidAlfa
I know what ECC is lol, cosmic rays flipping a bit is an extremely rare event.
In the 0.00000000001% chance of it happening you'll get a BSOD, the system will reboot and you'll resume gaming with your RGB hype, it's not a mainframe/server that should stay on at all costs, where a single minute off means $$$ fleeing away.

The OP explicitly said it was not for gaming.

Sure, ECC may be overkill even for workstation use. My main workstation has 64GB of std DDR4 and can run for several weeks without a reboot. Actually, I've never had it blue screen except when I was testing a buggy custom driver.

But the funny thing is, in particular if you buy it second-hand, DDR4 ECC for a given capacity may not be more expensive than std DDR4 - at least if you compare ECC with std DDR4 that can be overclocked (thus with XMP profiles and usually large heatsinks). There's a fricking lot of second-hand DDR4 ECC available out there from decommissioned servers.

Quote from: wraper
Quote from: SiliconWizard
I don't think you can use ECC RAM with CPUs that don't support it. I know Intel stuff better than current AMD stuff, but I doubt you can use ECC RAM with "consumer" AMD CPUs. That's not just a matter of the motherboard itself.
So I would expect to need at least an EPYC CPU or maybe a Threadripper Pro?
Let me know if I'm wrong. I'm curious.
Yes, you are wrong. Did you read my post? I run a Ryzen 3700X with ECC, and previously did the same with a 1700X. ECC works, and I can see WHEA corrected memory errors in Event Viewer if I overclock it too much. Every Ryzen CPU supports ECC except the APU type. Both Intel and AMD use the same silicon for the consumer and server markets. Intel disables ECC on consumer CPUs (except some very low end ones), AMD doesn't.

Your post was not very explicit and didn't mention which CPU it was, so uh. But as I said, it was more of a question than an assertion, as I don't know AMD CPUs well.
It's interesting to know that ECC is supported on their consumer Ryzen CPUs, even though it's not documented - at least I couldn't find any official AMD info about that. So they could very well decide to disable this in more recent CPUs without any notice.
Yes, Intel disables ECC support on their consumer CPUs and only enables it (along with support for much larger RAM capacities) on their Xeon CPUs, which I'm more familiar with. Intel consumer CPUs rarely support more than 128GB of RAM, while many Xeon ones support > 1TB RAM. And of course ECC.

Quote
Intel disables ECC on consumer CPUs (except some very low end ones),

Not sure which low end ones that would be? Do you have examples?
« Last Edit: March 30, 2023, 10:49:23 pm by SiliconWizard »
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16864
  • Country: lv
The two worst shitshows I had were with an ASUS PRIME X370-PRO and an Asrock Fatal1ty X370 Gaming K4 a few years ago. My PC at the time had that ASUS and a Ryzen 1700X, with that Asrock lying around. I had used the Asrock first, but IIRC it had some weird issue with the image sometimes not appearing with two of my monitors connected, so I replaced it with the ASUS.

I figured I could put together a PC with my 1700X for my sister, who lives in the UK, while she was visiting Latvia, and get a 3700X about two weeks later once supply was available, as they had just come out. So first I updated the ASUS board with the latest BIOS, which was the first version to support the Ryzen 3000 series, to avoid any issues later. Then I put the CPU into the Asrock board and built up the PC from some used and new parts. I decided to use an old OCZ CPU cooler (I'd had it since the AM2 socket, but it's compatible with AM4), but it had some slight sleeve bearing noise which I was not comfortable with. So I decided to get a new Sunon non-PC fan from a nearby electronic component shop, with tach output but no PWM input, which was fine since the motherboard can control CPU fan voltage too; I just needed to crimp on a proper connector. So I did that, gave the ECC RAM some overclock (I had previously got more of it than I need on eBay), set everything up, and it worked just fine.

Then I moved it around and POST failed (the mobo has a POST code indicator). I took out the battery to reset the BIOS, and everything booted. But then I figured out that if I disconnected power, POST would fail, whereas it kept working if I only shut it down, until the first power disconnect. If I didn't change BIOS settings, it booted after disconnecting power. Change them, and it did not. I wasted a ton of hours over a few days on that garbage. By then I needed to drive to another town to my parents, whom my sister was visiting, so I brought it as is, thinking I'd figure out the BIOS settings there. Lo and behold, the issue was the effing fan setting. If I set it to analog mode, the mobo hangs during POST after removing power  |O :palm: :wtf:.

A BIOS downgrade did not help either. So to keep fan speed control (it was a bit too loud without it) I ended up ordering a whole brand new cooler with a PWM fan, to get it fast while I was still there. Just in case, I checked, and it had the same issue with the analog control setting even with a proper PWM fan used. Great effing job, Asrock.

But my suffering was far from over. After some time I got back home, got the 3700X, put it in the ASUS mobo, and it immediately booted just fine. A newer BIOS had just come out but I did not update to it yet. I changed some BIOS settings, maybe RAM speed, I don't remember. And this garbage never booted again. I took out the battery for hours, used the CMOS reset jumper, a different PSU, GPU; nothing helped. As a cherry on top, this board does not support BIOS flashback. I felt something really fishy, so I ordered the cheapest Athlon AM4 CPU for like EUR 35, put it in, and this bastard booted just fine. Updated the BIOS to the latest version, put the 3700X back, and this garbage again booted just fine.  :horse:
« Last Edit: March 30, 2023, 11:37:09 pm by wraper »
 
The following users thanked this post: Zucca

Offline wraper

  • Supporter
  • ****
  • Posts: 16864
  • Country: lv
Quote from: SiliconWizard
But the funny thing is, in particular if you buy it second-hand, DDR4 ECC for a given capacity may not be more expensive than std DDR4 - at least if you compare ECC with std DDR4 that can be overclocked (thus with XMP profiles and usually large heatsinks). There's a fricking lot of second-hand DDR4 ECC available out there from decommissioned servers.

Registered ECC RAM used in servers does not work with desktop CPUs. You need unregistered ECC, which is much scarcer and not that cheap even used. Unregistered ECC also works in desktop Intel and AMD motherboards that don't support ECC; you just don't get the ECC functionality. BTW at least some old Intel consumer motherboards (like X79 with the LGA2011 socket) support registered ECC RAM, but you need to put a Xeon in them for it to work.
« Last Edit: March 30, 2023, 11:15:36 pm by wraper »
 

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
COM Port: oh yes! very nice to have

Modern mobos, especially for workstation/desktop, no longer carry that on board.

I used the PCIe x1 to 1 parallel + 2 serial ports WCH382L based card below (there are so many other variants too), and it just works. The drivers for Windows 10 x64 and Linux x64, downloaded from the manufacturer's website, work flawlessly, and it's dirt cheap. Oh, and the parallel and serial ports are all full-fledged COM and LPT ports like in the good olde PC days, e.g. COM with a crazy 8Mbps baud rate  :scared: , and the LPT port can be set to various modes: EPP, ECP, etc.

Details => http://www.wch-ic.com/products/CH382.html

« Last Edit: March 31, 2023, 03:26:17 am by BravoV »
 
The following users thanked this post: Zucca

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
As a desktop Ryzen ECC RAM user, I did make a thread almost 3 years ago -> Do you feel sinful overclocking an ECC RAM ?  :P

All I can say is I am happy, content and satisfied with my decision, and what is most important for me is "peace of mind" ... as I can afford it.  >:D

As wraper mentioned, a desktop AMD Ryzen must use unregistered ECC RAM. Be careful, as it is a totally different animal from the "registered" ECC RAM popular in former servers.

Mine below, Samsung B-Die  :P unregistered ECC, used in my current Ryzen. To trigger the ECC error "deliberately" (not easy though, as I needed a lot of trial and error for the OS to capture the corrected error rather than crash totally), I was undervolting the RAM wildly to provoke the error. Yes, my B-Die is a great overclocker even though I don't overclock it.  >:D
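
For anyone wanting to check whether the OS actually captured a corrected error: on Windows that is the WHEA entry in Event Viewer mentioned earlier in the thread. On Linux, a minimal sketch is to read the kernel's EDAC counters from sysfs, as below. It assumes an EDAC driver for the platform is loaded (otherwise the directory simply does not exist), and the function name is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int or None, "ue_count": int or None}}."""
    base = Path(root)
    if not base.is_dir():
        return {}                       # no EDAC driver loaded, or not Linux
    counts = {}
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None      # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, "corrected:", entry["ce_count"], "uncorrected:", entry["ue_count"])
```

A `ce_count` that goes up while the machine keeps running is the corrected-error case described above; an empty result only means no counters were found, not that ECC is inactive.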


 
The following users thanked this post: Zucca

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 5907
  • Country: es
My bad, I completely misread the topic, thought it was another RGB fanboy seeking new ways of wasting money.
 

Offline Zucca (Topic starter)

  • Supporter
  • ****
  • Posts: 4308
  • Country: it
  • EE meid in Itali
Quote from: wraper
The problem is that it's not only problematic RAM, CPU or motherboard. There are cosmic rays which will occasionally flip a bit even if there are no hardware problems whatsoever. In the past I had enough RAM problems which drove me nuts and which no RAM test would detect. The worst part was that I could not figure out where the issue was without randomly swapping hardware and testing for months to catch that nasty once-in-two-to-four-weeks BSOD. And it happened to me several times. So in the end I decided on a big no to a main PC without ECC.

spot on on everything!
 

Offline Zucca (Topic starter)

  • Supporter
  • ****
  • Posts: 4308
  • Country: it
  • EE meid in Itali
Quote from: BravoV
Details => http://www.wch-ic.com/products/CH382.html

Thanks, the ASUS Pro WS X570-ACE looks like it has a serial port. Good to know about that board!
 

Offline Zucca (Topic starter)

  • Supporter
  • ****
  • Posts: 4308
  • Country: it
  • EE meid in Itali
Quote from: wraper
this garbage again booted just fine.  :horse:

Thanks for sharing, this pushes me even further in the direction of GIGABYTE or MSI.
Too bad that SuperMicro does not have an AMD ATX MB, AFAIK.
 

Offline nightfire

  • Frequent Contributor
  • **
  • Posts: 585
  • Country: de
Anecdote: My last employer was a datacenter where, around 2003/2004, we had some SUN Sparc E4500s in operation. 14 CPUs, 14GB memory, meaning lots of modules.
And when the statistics said that sunspot activity was high (some colleagues monitored websites measuring solar activity), the number of correctable ECC errors went up.

I agree that the overall quality of memory modules went up in the past two decades, but another trend is countering it, as stated: factory "overclocking".
It's somewhere between cases where the spec simply does not specify certain setups that the technology easily allows, and manufacturers really stretching consumer goods to the max.
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14472
  • Country: fr
Quote from: wraper
BTW at least some old Intel consumer motherboards (like X79 with the LGA2011 socket) support registered ECC RAM, but you need to put a Xeon in them for it to work.

X99 too. And with Intel CPUs you need Xeons anyway to use ECC RAM.

As I said, I knew nothing about AMD supporting ECC RAM on their "consumer" CPUs, and have learned here that they require unregistered ECC. So that's good to know in order to avoid wasting money.

I have otherwise recycled an X99-based workstation with a Xeon CPU and registered ECC DDR4, and that turned out cheaper than expected.
 

Offline wraper

  • Supporter
  • ****
  • Posts: 16864
  • Country: lv
Quote from: BravoV
Mine below, Samsung B-Die  :P unregistered ECC, used in my current Ryzen. To trigger the ECC error "deliberately" (not easy though, as I needed a lot of trial and error for the OS to capture the corrected error rather than crash totally), I was undervolting the RAM wildly to provoke the error. Yes, my B-Die is a great overclocker even though I don't overclock it.  >:D

I use a pair of shitty old 2133MHz Hynix MFR (16GB Hynix branded sticks made in 2016), but I bought like 6 of those for cheap some years ago. It's well known for overclocking poorly. It totally sucked to overclock with the Ryzen 1700X and X370 motherboards. Especially on the Asrock X370, it would not overclock at all without adjusting drive strength. It was a bit better on the ASUS X370. Later, on the same ASUS X370 with the Ryzen 3700X, it clocked better and could be run stable at something like 2933 CL19. On my current Gigabyte B550 PRO-P with the same 3700X it surprisingly runs 3200MHz at its stock CL15 timings by just adjusting the RAM clock multiplier and raising the voltage to 1.35V.
 

Offline David Hess

  • Super Contributor
  • ***
  • Posts: 16615
  • Country: us
  • DavidH
Quote from: Zucca
Well, it should be easy since ECC will finally be integrated by default in the DDR5 protocol, AFAIK.
In other words, all DDR5 is ECC.

It kind of makes sense: the speed is so high that you need some error correction by default.

The above is just some YouTube info I got; I could be wrong since I did not read all the DDR5 protocol tech specs...

I did study the DDR5 specifications just to resolve this question.

All DDR5 implements limited ECC internally as part of the bus interface and *not* the sense amplifiers.  Data is corrected in the bus interface, but is *not* automatically written back to the DRAM array, so no scrubbing takes place.

However, DDR5 with a conventional ECC configuration does exist, with the 32-bit-wide memory channel being expanded to 40 bits.  Normal DDR5 has two 32-bit channels, and ECC DDR5 has two 40-bit channels.
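
As a side note on why 8 extra bits per channel is enough: a short sketch of the Hamming bound arithmetic, assuming a plain SECDED code (single error correction, double error detection). Which code a given memory controller actually uses is not stated in this thread, so treat the helper names and the choice of SECDED as illustrative assumptions. The 64/72 row is the familiar DDR4-style ECC width for comparison.

```python
def sec_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correcting Hamming bound)."""
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """SEC plus one overall parity bit, which adds double-error detection."""
    return sec_check_bits(data_bits) + 1


for data_bits, bus_bits in ((32, 40), (64, 72)):
    needed = secded_check_bits(data_bits)
    spare = bus_bits - data_bits - needed
    print(f"{data_bits} data bits: SECDED needs {needed} check bits, "
          f"a {bus_bits}-bit bus leaves {spare} spare")
```

Under that assumption a 32-bit channel needs 7 check bits, so the 40-bit channel has room for them, and two such channels cost 16 extra bits per DIMM where a single 64-bit DDR4-style channel needed 8.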
 
The following users thanked this post: Zucca, SiliconWizard

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
Quote from: David Hess
However, DDR5 with a conventional ECC configuration does exist, with the 32-bit-wide memory channel being expanded to 40 bits.  Normal DDR5 has two 32-bit channels, and ECC DDR5 has two 40-bit channels.

So it means that the internal ECC is just not "good enough", hence the creation of the 40-bit version.

Offline Zucca (Topic starter)

  • Supporter
  • ****
  • Posts: 4308
  • Country: it
  • EE meid in Itali
Beating the dead horse  :horse:

Tomorrow I will try to update the BIOS over the ACCE ETH.

Holy water from Lourdes is already prepared.

Did not work: the client board needs to have an agent already deployed to throw a BIOS over the ETH  :-DD  :horse:

More details...
https://forums.servethehome.com/index.php?threads/asus-pro-ws-x570-ace-bios-flashback.31319/#post-296223

Anyway, now I am pretty convinced it is an old BIOS topic. I ordered an AMD Ryzen 3 3200G on Amazon that I will return 5 minutes after the BIOS is upgraded.

BTW ASUS is not even trying; I would at least have put a big * everywhere for the newer CPUs that are not supported

Quote
Pro WS X570-ACE

AMD AM4 socket: Ready for AMD Ryzen™ 5000 Series*

*Ready does not mean it will work; you will need to upgrade the BIOS. By the way, with a newer processor installed on this board, the Pro WS X570-ACE will not boot and you will not be able to upgrade the BIOS. You could use our USB BIOS FlashBack® technology, but this board does not support it. This means you are stuck with a non-booting brick.
Yes, we at ASUS are certified monkeys and do not know how to sell motherboards; of course we don't care if our customers are left stranded in front of a black screen with a newer processor on our board. We simply don't want to have more customers in the future. ASUS wants to decrease its income because designing motherboards and selling them is a boring and frustrating process. It sucks.
ASUS just wants to file for bankruptcy as fast as possible, and to not pay a salary to anyone anymore. After that the ASUS employees will finally just stay in the jungle and eat bananas all day long.
Please support us and avoid buying ASUS products. We don't want your money; if you agree with our future company vision, please send us bananas.
If there is still money left we will pay for them; if not, please consider a donation, of bananas of course.



« Last Edit: April 01, 2023, 03:44:26 am by Zucca »
 

Offline Zucca (Topic starter)

  • Supporter
  • ****
  • Posts: 4308
  • Country: it
  • EE meid in Itali
Quote from: David Hess
All DDR5 implements limited ECC internally as part of the bus interface and *not* the sense amplifiers.  Data is corrected in the bus interface, but is *not* automatically written back to the DRAM array, so no scrubbing takes place.

Will the outside world then get notified if such a correction on the bus interface happened?
 

