Author Topic: What would cause failure of 4 new motherboards in 3 different PC's.  (Read 4208 times)


Offline Strider27 (Topic starter)

  • Newbie
  • Posts: 6
Hello Gentlemen,

First and foremost, let me preface this by saying that I have no expert or even intermediate knowledge of electronics. I am a sysadmin by trade, with some experience in electronics from occasional hardware repairs and hobbies. I can tell what most of the more basic components are and can recognise most IT-specific components (south/northbridge, audio ICs, network ICs, IO/Multi? controllers, etc.). I can somewhat read and understand a schematic (although schematics are not available for either of the boards mentioned below), and I have decent soldering skills and some rework experience.

We have a strange issue. We have deployed 6 custom-built PCs at 4 different locations; the PCs are identical, assembled with parts from the same batch. At one of the sites, the motherboards failed within the first month in all 3 PCs deployed. I initially blamed a faulty batch of motherboards. We replaced them with MSI B150M Mortar boards that were available at short notice from a local supplier, and they worked fine for a few months; however, 1 of the new boards failed this week.

Every failure has occurred over the weekend, while the PCs are off.

The power supplies are not of great quality but are acceptable. The C14 jack does not fit the C13 plug perfectly (it goes in about 90%); if you wiggle the cable moderately, it pops out slightly and may start arcing. (I don't believe it is bad enough for the manufacturer/supplier to accept an RMA, since it works fine if the cable is plugged in fully and left alone.) The PCs are suspended from the underside of the desks with straps, and the PSUs are located at the bottom of the cases, which stick out slightly below the backboard of the desks. The desks face into the middle of the room and the cable, while secure, is still accessible. My only other suspicion is that the cleaning crew are catching the power cables when they hoover etc., causing the cables to pop out slightly and arc.

All other machines at the site have been fine for the 2 years since their deployment (although their power supplies are at the top of the case and are not easily accessible behind the backboard of the desk). The identical machines from the same batch at other locations have also been fine since their deployment 6 months ago.

The PSUs are Thermaltake TR2 S 500 Watt units. They are not great quality; however, I have never had any major issues with them in any previous builds, and they seem acceptable for office-grade PCs.

The first set of failed motherboards are ASRock H110M-DGS R3.0. The failed component is the power regulator G9661M (see attached image) by the 24-pin connector on the motherboard. I believe I have identified it correctly (http://www.gmt.com.tw/product/datasheet/EDS-9661.pdf) and have ordered a replacement from eBay (https://www.ebay.ie/itm/10-pcs-New-G9661-25ADJF11U-G9661-25-9661-25-ic-chip/232485358342?ssPageName=STRK%3AMEBIDX%3AIT&_trksid=p2057872.m2749.l2649). I won't know whether anything else failed until the replacement IC arrives, which may take several months. The component is shorted and becomes very hot, preventing the motherboard from starting.

1 of the replacement motherboards, an MSI B150M Mortar, has also failed. The failed component is marked R8015A1SB (see attached image), which I'm currently trying to identify in another thread (https://www.eevblog.com/forum/microcontrollers/identifyingreplacing-r8015a1sb/). It is also shorted, gets hot, and prevents the motherboard from booting.

Am I crazy to be blaming the cleaning crew snagging the cables and the loose C14 jacks? Is it more likely to be caused by something else? Is there anything I can do to pinpoint the cause?

Thanks in advance.
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8469
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #1 on: March 01, 2018, 03:09:33 am »
That other component is actually a GS7133.

It and the G9661 are both 3A LDOs with 5V input.

That combined with the fact that the failures are occurring when off suggests they are on the 5Vsb input (you can check this by measuring continuity between the input of the LDO and pin 9 of the ATX connector), and definitely agrees with the arcing scenario --- the high frequency noise that creates could be sending spikes into the 5Vsb rail and killing connected components such as that LDO.

Measure the 5Vsb voltage of the PSUs, to make sure that they haven't been damaged by the arcing, as otherwise they could be the ones continuing to kill mobos after the cause has been fixed.

A C14/13 should not be loose. Loose connections create heat and are a fire hazard.
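
As a rough illustration of the 5Vsb check suggested above, here is a minimal sketch that compares multimeter readings against the +/-5% tolerance the ATX specification allows on the +5VSB rail (4.75 V to 5.25 V). The PSU labels and readings are made up for the example. Note that a steady-state multimeter reading cannot show fast spikes of the kind arcing might produce, so an in-range reading only rules out a PSU whose standby output has drifted permanently.

```python
NOMINAL_5VSB = 5.0   # volts, nominal standby rail
TOLERANCE = 0.05     # ATX allows +/-5% on +5VSB


def check_5vsb(measured_volts):
    """Return True if a +5VSB reading is inside the +/-5% window."""
    low = NOMINAL_5VSB * (1 - TOLERANCE)
    high = NOMINAL_5VSB * (1 + TOLERANCE)
    return low <= measured_volts <= high


# Hypothetical readings taken at pin 9 of the 24-pin ATX connector.
readings = {"PSU-A": 5.08, "PSU-B": 5.31, "PSU-C": 4.70}

for name, volts in readings.items():
    status = "OK" if check_5vsb(volts) else "OUT OF SPEC"
    print(f"{name}: {volts:.2f} V -> {status}")
```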

 
The following users thanked this post: Strider27

Offline floobydust

  • Super Contributor
  • ***
  • Posts: 7594
  • Country: ca
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #2 on: March 01, 2018, 03:26:44 am »
What do you have for ESD precautions when handling the motherboards and installing them? Usually I see ESD cause failures about 5-15 days after a system is put into service.

If you are rough installing the ATX power supply connectors, or flexing the motherboard, or its mounting spacers+screws are shorting to parts, I could see parts fail.

I have seen cleaning staff plug their vacuum cleaner into the same power bar as office PCs and cause damage. That was very hard to track down: failures and HDD corruption every 2 business days, overnight, when they came to clean.
 
The following users thanked this post: Strider27

Offline Armadillo

  • Super Contributor
  • ***
  • Posts: 1725
  • Country: 00
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #3 on: March 01, 2018, 03:27:51 am »
Judging from your ability to find the faulty components, I suspect it is thermal, because you could easily have seen ripples and harmonics disturbances if you wanted to.
You didn't elaborate on the cooling. I suspect the failures are heat-induced: improper channeling of airflow across the board, insufficient airflow, or fans not spinning fast enough.
Put thermal pads or a small aluminum heatsink onto the ICs [yes, small heatsinks do exist].
Use a thermal camera to see where it is glowing and put the heatsink or pads there.

Edit: Your cooling design should also cater for the worst ambient environment the PC will be located in, e.g. an unconditioned factory floor, local heat sources, etc., or the PC should be clearly labelled "Ambient 25 Deg.C MAX!"
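
To put rough numbers on the heat argument: a linear regulator dissipates the whole input-to-output voltage drop as heat, so P = (Vin - Vout) * Iload, and the junction temperature is roughly Tj = Ta + P * theta_JA. The sketch below applies that to an LDO with a 5 V input. The output voltage, load current and thermal resistance are hypothetical placeholders, not values taken from either board or datasheet; only the 25 C ambient comes from the label suggested above.

```python
def ldo_thermal_estimate(v_in, v_out, i_load, theta_ja, t_ambient):
    """Return (power_w, junction_temp_c) for a linear regulator.

    power_w         = (v_in - v_out) * i_load      heat dissipated in the pass element
    junction_temp_c = t_ambient + power_w * theta_ja
    Ground-pin current is ignored, so this is a slight underestimate.
    """
    power_w = (v_in - v_out) * i_load
    junction_temp_c = t_ambient + power_w * theta_ja
    return power_w, junction_temp_c


# All inputs except the ambient temperature are hypothetical.
power, tj = ldo_thermal_estimate(
    v_in=5.0,        # volts, 5 V input rail
    v_out=3.3,       # volts, assumed output
    i_load=1.0,      # amps, assumed load
    theta_ja=50.0,   # C/W, assumed junction-to-ambient thermal resistance
    t_ambient=25.0,  # C
)
print(f"Dissipation: {power:.2f} W, estimated junction temperature: {tj:.0f} C")
```

With real datasheet values in place of the placeholders, the same arithmetic shows how much margin a heatsink or pad would need to buy.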
« Last Edit: March 01, 2018, 03:44:27 am by Armadillo »
 
The following users thanked this post: Strider27

Offline Rasz

  • Super Contributor
  • ***
  • Posts: 2617
  • Country: 00
    • My random blog.
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #4 on: March 01, 2018, 03:59:20 am »
ASRock H110M-DGS R3.0, the failed component is the power regulator G9661M

ASRock has been selling factory-defective motherboards since socket 1150. They all die when powered off _and_ unplugged; it has something to do with the CMOS battery/standby circuit, resulting in a fried standby LDO (something reverse biasing? feeding power back? or latching?).

https://www.reddit.com/r/buildapc/comments/2jrt1b/psarequest_possible_reoccurring_problem_with_the/?sort=new

I have a Z87 Pro4 board with the same issue. It worked great for 4 years plugged in until I swapped GPUs (had to disconnect power), then it started having problems booting: it would boot on the second try, then on the fifth, then on the 50th, and now it is completely dead. In most cases the fault gets progressively worse until the board dies.


Maybe someone is turning off the power during the weekends in that office, or the cleaning lady uses the power socket for the vacuum cleaner ;) Edit: Ha, floobydust beat me to it! Although I meant staff unplugging the main PC power strip to plug in her equipment.
« Last Edit: March 01, 2018, 04:01:11 am by Rasz »
Who logs in to gdm? Not I, said the duck.
My fireplace is on fire, but in all the wrong places.
 
The following users thanked this post: Strider27, rakanishu

Offline Strider27 (Topic starter)

  • Newbie
  • Posts: 6
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #5 on: March 01, 2018, 03:32:19 pm »
That other component is actually a GS7133.

It and the G9661 are both 3A LDOs with 5V input.

That combined with the fact that the failures are occurring when off suggests they are on the 5Vsb input (you can check this by measuring continuity between the input of the LDO and pin 9 of the ATX connector), and definitely agrees with the arcing scenario --- the high frequency noise that creates could be sending spikes into the 5Vsb rail and killing connected components such as that LDO.

Measure the 5Vsb voltage of the PSUs, to make sure that they haven't been damaged by the arcing, as otherwise they could be the ones continuing to kill mobos after the cause has been fixed.

A C14/13 should not be loose. Loose connections create heat and are a fire hazard.



Thanks, that confirms my suspicions. As Rasz states in his post, those LDOs apparently have inherent issues as it is, and fluctuations are definitely not helping. The cleaning staff must be either knocking the cable around or plugging their hoover into the same socket; as these sockets face towards the middle of the room, they are the most easily accessible power points to plug into, so this seems the most likely scenario. I asked them not to use those sockets before any of the issues started. I have also asked them to be careful around the area when cleaning, but I don't think they took it seriously enough. I will have a sterner chat with them.

I have tested the PSUs after the failures; they seem fine and are currently in rotation with the new motherboards.

The C14/13 isn't really that bad, but I will bite the bullet and replace the PSUs anyway. You are right, better safe than sorry.

Quick question: would I get away with replacing the GS7133 on the MSI board with the G9661M that I ordered for the ASRock? Looking at the datasheets, both LDOs use resistors on the motherboard to set the output voltage via feedback/ADJ pin 7. Would the G9661M have the same output voltage as the GS7133 with the same set of resistors? I'm not knowledgeable enough to tell. I can see the pinout for the SOP-8 package is the same for both, assuming that pin 7, marked FB (feedback) on one and ADJ (adjustment?) on the other, performs the same function.
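
For what it's worth, the question above comes down to the reference voltage each part holds at its FB/ADJ pin. Adjustable regulators of this kind normally follow Vout = Vref * (1 + R_top / R_bottom), so the same resistor pair gives the same output only if both parts have the same Vref. The sketch below works that through; the resistor values and reference voltages are hypothetical placeholders, and the real figures must be read from the G9661 and GS7133 datasheets.

```python
def ldo_vout(vref, r_top, r_bottom):
    """Output voltage of an adjustable LDO with a resistive feedback divider.

    r_top sits between Vout and the FB/ADJ pin, r_bottom between FB/ADJ and ground.
    Bias current into the FB/ADJ pin is ignored.
    """
    return vref * (1.0 + r_top / r_bottom)


# Hypothetical divider as it might be fitted on a board.
R_TOP = 10_000.0     # ohms
R_BOTTOM = 3_200.0   # ohms

# Hypothetical reference voltages: two parts that match and one that does not.
for label, vref in (("original part", 0.8),
                    ("substitute, same Vref", 0.8),
                    ("substitute, different Vref", 1.25)):
    print(f"{label}: Vref = {vref} V -> Vout = {ldo_vout(vref, R_TOP, R_BOTTOM):.2f} V")
```

If the two datasheets list the same reference voltage, the existing resistors should give the same output; if they differ, the substitute would push the rail to a different voltage. The pinout, current rating and enable/power-good behaviour would still need checking separately.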

What do you have for ESD precautions when handling the motherboards and installing them? Usually I see ESD cause failures about 5-15 days after a system is put into service.

If you are rough installing the ATX power supply connectors, or flexing the motherboard, or its mounting spacers+screws are shorting to parts, I could see parts fail.

I have seen cleaning staff plug their vacuum cleaner into the same power bar as office PCs and cause damage. That was very hard to track down: failures and HDD corruption every 2 business days, overnight, when they came to clean.

Anti-static mats, wristbands, and gloves (for more sensitive/expensive parts, CPUs etc.). I have been building and repairing PCs and servers for 14 years, the last 8 professionally. It is very unlikely to have been caused by me: like anyone else, I may make a silly mistake here or there, but those are normally caught straight away, and definitely not on 3 out of 6 machines on the same day. I have had maybe 2-3 minor hiccups that were my fault (not paying attention) in the last 14 years. The lab has been used in its current form for the past 3 years and has seen close to a thousand machines go through it, so the cause is very unlikely to be environmental. Sorry, I don't mean to come off as standoffish, just answering your questions.

Man, that must have been annoying. I know the feeling, trying to explain this to people not familiar with IT/Electronics is fairly annoying, they think you are being an ass and trying to seem "smart" and looking for things to "annoy" them about. I will certainly have a more stern chat with them and might even arrange a demonstration just to drive it home.

Judging from your ability to find the faulty components, I suspect it is thermal, because you could easily have seen ripples and harmonics disturbances if you wanted to.
You didn't elaborate on the cooling. I suspect the failures are heat-induced: improper channeling of airflow across the board, insufficient airflow, or fans not spinning fast enough.
Put thermal pads or a small aluminum heatsink onto the ICs [yes, small heatsinks do exist].
Use a thermal camera to see where it is glowing and put the heatsink or pads there.

Edit: Your cooling design should also cater for the worst ambient environment the PC will be located in, e.g. an unconditioned factory floor, local heat sources, etc., or the PC should be clearly labelled "Ambient 25 Deg.C MAX!"

Thanks for the input, my ability to troubleshoot electronic issues isn't really something to write home about, I got lucky that in both cases the components are shorted and are getting fairly hot, I found them by hand, looking for hot spots, while plugged in. If I had to find them with a multimeter, without a schematic available it would be another story. Just to confirm the gaps in my knowledge about electronics: "ripples and harmonics disturbances" makes no sense to me, I'm assuming you mean fluctuations/surges in the grid?

With regards to airflow, it is a very unlikely cause. I have a considerable amount of experience building PCs and am familiar with airflow best practices. There is a fan blowing right over the component in both cases, the ambient temperature does not go above 27 C in the summer, the machines are monitored when on, and none of the monitored temperatures ever go high enough to be a cause for concern. I am more inclined to go with my initial suspicion and the comment from amyk.

Once the replacement LDOs arrive, I will seat some heat sinks onto them with thermal glue, as you suggested. I have a bunch lying around the office and was looking for an excuse to use them.

I will also invest in a thermal camera just to ease troubleshooting and add another step to QA when building, by checking the airflow visually, I really should have done it a long time ago.

ASRock H110M-DGS R3.0, the failed component is the power regulator G9661M

ASRock has been selling factory-defective motherboards since socket 1150. They all die when powered off _and_ unplugged; it has something to do with the CMOS battery/standby circuit, resulting in a fried standby LDO (something reverse biasing? feeding power back? or latching?).

https://www.reddit.com/r/buildapc/comments/2jrt1b/psarequest_possible_reoccurring_problem_with_the/?sort=new

I have a Z87 Pro4 board with the same issue. It worked great for 4 years plugged in until I swapped GPUs (had to disconnect power), then it started having problems booting: it would boot on the second try, then on the fifth, then on the 50th, and now it is completely dead. In most cases the fault gets progressively worse until the board dies.


Maybe someone is turning off the power during the weekends in that office, or the cleaning lady uses the power socket for the vacuum cleaner ;) Edit: Ha, floobydust beat me to it! Although I meant staff unplugging the main PC power strip to plug in her equipment.



Thanks for your post, the inherent issues with the LDO would certainly not help matters. I do believe it is the cleaning crew either plugging into the same socket or messing around with the power cable, too many things point to that.

Wow, my first guess with your Z87 Pro4 issue would have been leaking caps in the PSU or on the board itself, as in my experience they exhibit the exact symptoms you described, but I'm assuming you have done your research/testing on the matter.
 

Offline Armadillo

  • Super Contributor
  • ***
  • Posts: 1725
  • Country: 00
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #6 on: March 01, 2018, 04:16:23 pm »
The attached response from a TI employee could shed some light into the "suspected" failure mode.

Additionally, mobos and some graphics cards are not all designed the same, so I have observed that some LDOs run a tad hotter than others. This is particularly damaging at the instant when Vin diminishes, which causes the series resistance to increase and could lead to perilous thermal damage from the "large" back-flow current. In such cases, a thermal camera would show more than the usual observation of airflow.

As the TI employee noted, a series diode at Vin could be a solution.

There are many designs and design problems out there, so the above is not representative of all the other possible failure modes.   :)

 
The following users thanked this post: Strider27

Offline drussell

  • Super Contributor
  • ***
  • Posts: 1855
  • Country: ca
  • Hardcore Geek
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #7 on: March 01, 2018, 08:25:43 pm »
^^^   Uhhh...  ASRock and Micro-Star?  :)

Seriously, though... that IS still kinda strange, all around, since you would think that Thermaltake would make a decent enough supply but, I suppose since everything from pretty much every brand has been gutted to apparently become cost-reduced cheap Chinese junk it should really be no surprise.  :palm:
 

Offline Strider27 (Topic starter)

  • Newbie
  • Posts: 6
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #8 on: March 01, 2018, 09:35:34 pm »
^^^   Uhhh...  ASRock and Micro-Star?  :)

Seriously, though... that IS still kinda strange, all around, since you would think that Thermaltake would make a decent enough supply but, I suppose since everything from pretty much every brand has been gutted to apparently become cost-reduced cheap Chinese junk it should really be no surprise.  :palm:

Clients are always on a budget, so we have to compromise: ASRock, MSI, Thermaltake, Kingston or WD Green SSDs, etc. Still better than pre-built machines of similar spec (and higher price) from the likes of Dell, HP, etc. (with the exception of some higher-end workstations and servers, of course).
But you are right, these days most components are the same "cut-corners-to-save-costs" parts with different labels. You know something is not quite right when the PSUs start weighing less than their own cables. :(
 

Offline picafra

  • Newbie
  • !
  • Posts: 5
  • Country: it
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #9 on: November 17, 2020, 01:50:34 pm »
Hello friend, I have a similar problem.
I have an HP ProDesk G1 MT desktop PC. The PC does not power on. The green LED on the motherboard is on, and I noticed that the GS7133 LDO regulator overheats. I replaced it, but the PC still does not start.
I would like to know if you have solved the problem.
Thank you
 

Offline Strider27 (Topic starter)

  • Newbie
  • Posts: 6
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #10 on: November 17, 2020, 02:14:37 pm »
Hello friend, I have a similar problem.
I have an HP ProDesk G1 MT desktop PC. The PC does not power on. The green LED on the motherboard is on, and I noticed that the GS7133 LDO regulator overheats. I replaced it, but the PC still does not start.
I would like to know if you have solved the problem.
Thank you

Hello, replacing the LDOs did not resolve the issue on the already failed motherboards. Due to time constraints and the low cost of motherboards, I left them for "later" for further diagnostics and have not touched them since; they are still on a shelf somewhere and unlikely to be looked at :)
After having a stern talk with the cleaning crew, making sure they use a PAT tested Vacuum Cleaner supplied by the client AND making sure they do not use the power sockets on the same circuit/fuse as the computers, no new fried boards thankfully.
 
The following users thanked this post: bd139, I wanted a rude username

Offline picafra

  • Newbie
  • !
  • Posts: 5
  • Country: it
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #11 on: November 17, 2020, 04:29:53 pm »
Hello friend. Thanks for your availability and information.
 

Offline perieanuo

  • Frequent Contributor
  • **
  • Posts: 914
  • Country: fr
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #12 on: November 18, 2020, 09:03:41 am »
hi,
you got either a batch of bad PSUs or, most likely, a bad AC supply. I recently installed a voltage monitoring relay and, surprise surprise, I got spikes over 250 VAC.
Of course the electricity supplier vows it's all fine on their side. My argument that the supervisor relay was validated against 3 calibrated multimeters made no impression on them.
So, what can we do in this case?
My solution was to combine the supervisor voltage relay with a contactor in order to properly cut the 230 VAC in an under- or overvoltage situation AND to install a good-quality (AVR-type) backup power supply.
I don't believe the AC voltage regulators provided with some backup power supplies, or stand-alone ones, are quick enough to absorb spikes; that's why the supervisor relay with its attached contactor is in place.

Well, good luck with diagnosing your problem (alternatively, before buying anything, you can just monitor your 230 VAC supply with a dedicated logger, if you can)
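
If a logger is used as suggested above, the over/undervoltage check the supervisor relay performs can be reproduced offline on the logged samples. This is only a sketch: the +/-10% window around 230 V is an assumed threshold (a real relay would be set to whatever the installation requires), the sample log is invented, and a logger sampling once a minute will miss short spikes entirely.

```python
from datetime import datetime

NOMINAL_V = 230.0
LOW_LIMIT = NOMINAL_V * 0.90    # assumed undervoltage threshold
HIGH_LIMIT = NOMINAL_V * 1.10   # assumed overvoltage threshold


def find_excursions(samples, low=LOW_LIMIT, high=HIGH_LIMIT):
    """Return (timestamp, volts, kind) for every sample outside [low, high]."""
    events = []
    for timestamp, volts in samples:
        if volts > high:
            events.append((timestamp, volts, "over"))
        elif volts < low:
            events.append((timestamp, volts, "under"))
    return events


# Hypothetical logger output: (timestamp, RMS volts).
log = [
    (datetime(2020, 11, 14, 22, 0), 231.5),
    (datetime(2020, 11, 14, 22, 1), 256.2),
    (datetime(2020, 11, 14, 22, 2), 229.8),
    (datetime(2020, 11, 14, 22, 3), 198.4),
]

for timestamp, volts, kind in find_excursions(log):
    print(f"{timestamp:%Y-%m-%d %H:%M} {kind}voltage: {volts:.1f} V")
```

Lining up the timestamps of any excursions with the cleaning schedule would show whether the two are related.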
 

Offline ogden

  • Super Contributor
  • ***
  • Posts: 3731
  • Country: lv
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #13 on: November 18, 2020, 10:04:40 am »
making sure they use a PAT tested Vacuum Cleaner supplied by the client AND making sure they do not use the power sockets on the same circuit/fuse as the computers, no new fried boards thankfully.

Right. ESD shoes too. Violent ESD zapping of an unearthed PC case is one of the possible 5Vsb failure modes. You should check that the PCs are grounded/earthed (then let us know :). If there is no grounding/earthing, consider replacing the PSUs of the affected PCs, because the isolation of the 5Vsb supply and/or the Y-capacitors may no longer be intact. I would blame bad, cheap PSU design here, not the motherboard.
 

Offline picafra

  • Newbie
  • !
  • Posts: 5
  • Country: it
Re: What would cause failure of 4 new motherboards in 3 different PC's.
« Reply #14 on: November 19, 2020, 10:44:20 am »
Hi, thanks for the post.
Regarding the GS7133 LDO regulator that overheats: what voltage should the Vout pin (6) have in standby (5 V?)?
Sorry for my imperfect English.
Thanks in advance
 

