Author Topic: what's your recent fail?  (Read 7918 times)


Online nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: what's your recent fail?
« Reply #50 on: May 02, 2021, 08:28:42 pm »
BTW: an interesting fail (not from my side) but I still think it is good to share: About a decade ago I designed a PCB for a customer with a SoC and some DDR memory. By today's standards it is not a high-density design. This board had always been produced with low numbers of failed boards, but the customer changed to a different assembler and the number of failed boards increased massively. A lot of effort went into finding why the boards failed, and it turned out that when the DDR memory is cold a significant number of boards won't start. In the end the board itself remained the only suspect, and it turned out the circuit board manufacturer had made the traces wider in order to produce it on a production process not quite suitable for the board. Probably some of the length-tuned serpentine traces have a short between them, creating a stub or timing error. The same components on a board made with a better etching process have no problems.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline rsjsouza

  • Super Contributor
  • ***
  • Posts: 5986
  • Country: us
  • Eternally curious
    • Vbe - vídeo blog eletrônico
Re: what's your recent fail?
« Reply #51 on: May 02, 2021, 10:51:31 pm »
Many decades ago on an internship I worked on a prototype board (an ISA PC card) that had two voltage regulators (7805, 7812) vertically mounted (not my design).

In the process of debugging the board and its firmware, I had to remove and put the board back a few times a day. Due to space constraints, this board could only be placed beside another board that controlled the Data I/O 8051 EPROM programmer.

One day, in a moment of distraction, I put our board in its slot but did not realize the two voltage regulators were touching the other board... A silent but deadly zap took the EPROM programmer down. Needless to say, the design was changed and the regulators were mounted horizontally...
Vbe - vídeo blog eletrônico http://videos.vbeletronico.com

Oh, the "whys" of the datasheets... The information is there not to be an axiomatic truth, but instead each speck of data must be slowly inhaled while carefully performing a deep search inside oneself to find the true metaphysical sense...
 

Offline exe (Topic starter)

  • Supporter
  • ****
  • Posts: 2562
  • Country: nl
  • self-educated hobbyist
Re: what's your recent fail?
« Reply #52 on: May 03, 2021, 06:29:33 am »
After several days of randomly changing various things with no positive result, I was inspired to leave it powered up, and after several more days it POSTed.  Now it reliably POSTs, so far.

That's weird but... When I worked in a datacenter I had similar issues with motherboards that at first appeared to be dead. Like, I remember after a power outage I had to repair a very old server in a crudely made enclosure (replace an HDD or something like that). I dropped a screw on the motherboard. The server shut itself down immediately and refused to start again. Worse yet, I didn't have spare parts for the old piece of junk it was. Lucky me, somehow it started after 15 minutes and tens of attempts to boot it.
 

Offline KE5FX

  • Super Contributor
  • ***
  • Posts: 1889
  • Country: us
    • KE5FX.COM
Re: what's your recent fail?
« Reply #53 on: May 03, 2021, 06:54:34 am »
BTW: an interesting fail (not from my side) but I still think it is good to share: About a decade ago I designed a PCB for a customer with a SoC and some DDR memory. By today's standards it is not a high-density design. This board had always been produced with low numbers of failed boards, but the customer changed to a different assembler and the number of failed boards increased massively. A lot of effort went into finding why the boards failed, and it turned out that when the DDR memory is cold a significant number of boards won't start. In the end the board itself remained the only suspect, and it turned out the circuit board manufacturer had made the traces wider in order to produce it on a production process not quite suitable for the board. Probably some of the length-tuned serpentine traces have a short between them, creating a stub or timing error. The same components on a board made with a better etching process have no problems.

Did you specify the stackup, or go with the fab's default?  It may have changed.  Some of the Chinese fabs have been using a single layer of prepreg between the L1 and L2 copper, which is obviously like putting a capacitor in parallel with every node on those layers.  Makes for nice skinny 50-ohm traces, at least until they reach a component pad...
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: what's your recent fail?
« Reply #54 on: May 03, 2021, 07:42:57 am »
BTW: an interesting fail (not from my side) but I still think it is good to share: About a decade ago I designed a PCB for a customer with a SoC and some DDR memory. By today's standards it is not a high-density design. This board had always been produced with low numbers of failed boards, but the customer changed to a different assembler and the number of failed boards increased massively. A lot of effort went into finding why the boards failed, and it turned out that when the DDR memory is cold a significant number of boards won't start. In the end the board itself remained the only suspect, and it turned out the circuit board manufacturer had made the traces wider in order to produce it on a production process not quite suitable for the board. Probably some of the length-tuned serpentine traces have a short between them, creating a stub or timing error. The same components on a board made with a better etching process have no problems.

Did you specify the stackup, or go with the fab's default?  It may have changed.  Some of the Chinese fabs have been using a single layer of prepreg between the L1 and L2 copper, which is obviously like putting a capacitor in parallel with every node on those layers.  Makes for nice skinny 50-ohm traces, at least until they reach a component pad...
The stackup is not very critical for this design but that wasn't the problem; the stackup was the same.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline mindcrime

  • Supporter
  • ****
  • Posts: 394
  • Country: us
Re: what's your recent fail?
« Reply #55 on: May 03, 2021, 01:10:34 pm »
mindcrime, if you don't mind a suggestion from an old German guy who went to uni in Konrad-Zuse-Strasse:
No need for float, if you can make sure that your integer accumulator is large enough.
Make sure to tell your MCU in which order to process the numbers. If you don't trust brackets and precedence, do:

int16_t accu; // (adequate till 2^15 / 9 )
accu = maxtemp.readCurrentTemperature();
accu *= 9;
accu /= 5;
accu += 32;
Serial.println( accu );

or:

Serial.println( (int)(( maxtemp.readCurrentTemperature() * (9./5) ) + 32));

Don't mind at all! This, and the ensuing discussion, has been enlightening for me. I've learned a new thing or two.

That said, in the actual code that will run on the oven controller, there won't be any need for this conversion stuff at all. Everything will be done in Celsius. This code was only a convenience for me so I could easily "eyeball" the numbers from the thermocouple and see if they looked right or not.
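
For anyone else eyeballing thermocouple numbers this way, the integer approach quoted above can be sketched as plain C++ (not Arduino-specific). This is only a sketch under assumptions: the function names are made up for illustration, and the input is taken as whole degrees Celsius in an int16_t, which may not match what maxtemp.readCurrentTemperature() actually returns.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the thread): whole degrees C in, whole degrees F out.
// The multiplication happens before the division, so no fraction is lost early.
int16_t celsius_to_fahrenheit_int(int16_t celsius) {
    // Keep celsius * 9 inside the int16_t range (roughly 2^15 / 9, as the quoted comment says).
    assert(celsius > -3640 && celsius < 3640);
    int16_t accu = celsius;
    accu *= 9;
    accu /= 5;   // integer division, truncates toward zero
    accu += 32;
    return accu;
}

// The pitfall the explicit ordering avoids: (9 / 5) is integer division and evaluates to 1.
int16_t celsius_to_fahrenheit_wrong(int16_t celsius) {
    return static_cast<int16_t>(celsius * (9 / 5) + 32);
}
```

With these definitions, 100 °C comes out as 212 from the first function and as 132 from the second, which is the kind of error that is easy to spot when eyeballing readings.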

 
The following users thanked this post: harerod

Offline harerod

  • Frequent Contributor
  • **
  • Posts: 449
  • Country: de
  • ee - digital & analog
    • My services:
Re: what's your recent fail?
« Reply #56 on: May 04, 2021, 06:00:55 am »
mindcrime, thank you for your kind words. I kept my comment as short as possible, so as not to scare you off. Integer arithmetic is a huge and interesting topic, arguably worth a thread in the Microcontroller section.

One huge advantage of integers over floating point hasn't been mentioned yet:
The precision of integer arithmetic can be predicted more easily than that of floats. Automatic mantissa/exponent adjustment might be a good starting point for research.
This all leads to the classic fail, where a newbie compares two floats for identity to exit a loop...
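
A minimal sketch of that classic fail, in plain C++ with doubles. The function names and the max_iterations safety net are my own additions for illustration, so that the demonstration terminates instead of hanging.

```cpp
#include <cassert>
#include <cmath>

// Adds `step` to an accumulator until it is exactly equal to `target`,
// or until `max_iterations` is reached (safety net so the demo terminates).
int iterations_until_exactly_equal(double step, double target, int max_iterations) {
    assert(max_iterations > 0);
    double x = 0.0;
    int n = 0;
    while (x != target && n < max_iterations) {  // the classic fail: exact comparison
        x += step;
        ++n;
    }
    return n;
}

// Same loop, but it exits once the accumulator is within `tolerance` of the
// target, or has reached or passed it.
int iterations_until_close(double step, double target, double tolerance, int max_iterations) {
    assert(max_iterations > 0 && tolerance >= 0.0);
    double x = 0.0;
    int n = 0;
    while (std::fabs(x - target) > tolerance && x < target && n < max_iterations) {
        x += step;
        ++n;
    }
    return n;
}
```

Stepping by 0.1 towards 1.0, the exact comparison never matches, because ten additions of 0.1 land just below 1.0, so the first loop only stops at the safety limit. The tolerance version stops after 10 steps. A step such as 0.25, which is exactly representable in binary, happens to work with the exact comparison, which is what makes this bug so easy to miss in testing.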

T3sl4co1l, I like your extension to my remark.
I love the reference to 2.048V references even better, knowing that an ADR420A will come with 2.045..2.051V initial output voltage. In that context a REF2920 2.007..2.048..2.089V becomes outright hilarious, especially knowing that many designers won't read past the marketing text on page 1 of the datasheet.
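
To put rough numbers on that, here is a small integer-only sketch. The 12-bit ADC is purely my assumption for illustration (nobody in the thread named a converter), and the function names are made up. With a 2.048 V reference and 12 bits, one LSB is nominally 500 µV, which is the whole attraction of that voltage.

```cpp
#include <cassert>
#include <cstdint>

// Nominal LSB size in microvolts for an ideal ADC with `bits` of resolution.
int32_t lsb_microvolts(int32_t vref_nominal_uv, int bits) {
    assert(bits > 0 && bits < 31);
    return vref_nominal_uv / (int32_t{1} << bits);
}

// Full-scale error, in nominal LSBs, when the firmware assumes the nominal
// reference voltage but the part actually outputs vref_actual_uv.
int32_t full_scale_error_lsb(int32_t vref_nominal_uv, int32_t vref_actual_uv, int bits) {
    const int32_t lsb = lsb_microvolts(vref_nominal_uv, bits);
    assert(lsb > 0);
    return (vref_actual_uv - vref_nominal_uv) / lsb;
}
```

Under that 12-bit assumption, the 2.045..2.051 V window works out to about ±6 LSB at full scale, and the 2.007..2.089 V window to about ±82 LSB, unless the reference is calibrated out.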

Regarding accu size on integer machines - does anybody remember the DSP56K series? 24-bit registers, leaving a lot of leeway for operations on 16-bit audio. That was the bee's knees in 1990s embedded audio processing. I remember doing active noise cancelling with those. Taking DSP56K as a reference, doing DSP on an ARM Mx feels like a dream.
 

Offline harerod

  • Frequent Contributor
  • **
  • Posts: 449
  • Country: de
  • ee - digital & analog
    • My services:
Re: what's your recent fail?
« Reply #57 on: May 04, 2021, 06:14:21 am »
nctnico, regarding your PCB fail: I would be curious as to how you followed up on that PCB issue. Stealthily "optimizing" designs is an absolute showstopper in my book. One of my customers recently found out that a manufacturer had adjusted my layout (fully specified PCB), without giving feedback. That was for a medical device, which carries medical EMI and safety certificates plus ETSI. I can pat myself on the shoulder that the design is robust enough to keep functioning, but this isn't the point. The point is that a manufacturer didn't produce what was ordered and thus potentially endangered patients.
 

Offline Humanoid

  • Regular Contributor
  • *
  • Posts: 88
  • Country: us
Re: what's your recent fail?
« Reply #58 on: May 05, 2021, 11:29:42 pm »
Small fail recently: Opened up a piece of used audio gear and was a little too forceful with removing a connector and tore the wires out of the solder points.  >:D The wires were frayed so I had to replace them anyway and the connector was unscathed, so it wasn't a big deal.

Also snapped a pin off a battery holder, but it was a flimsy piece of #%@^@& so I'm not hurt by it :P

 
The following users thanked this post: Ed.Kloonk

Offline twospoons

  • Regular Contributor
  • *
  • Posts: 228
  • Country: nz
Re: what's your recent fail?
« Reply #59 on: May 06, 2021, 03:41:26 am »
SPI - forgot that MOSI and MISO need to be crossed over (not MOSI to MOSI and MISO to MISO !) . So, another board revision needed. Stupid mistake.
 

Offline exe (Topic starter)

  • Supporter
  • ****
  • Posts: 2562
  • Country: nl
  • self-educated hobbyist
Re: what's your recent fail?
« Reply #60 on: May 06, 2021, 11:32:45 am »
SPI - forgot that MOSI and MISO need to be crossed over (not MOSI to MOSI and MISO to MISO !) . So, another board revision needed. Stupid mistake.

I'd say MOSI of master connects to MOSI of the slave, no? So, it's MOSI to MOSI. MOSI = "master out, slave in"
 

Offline harerod

  • Frequent Contributor
  • **
  • Posts: 449
  • Country: de
  • ee - digital & analog
    • My services:
Re: what's your recent fail?
« Reply #61 on: May 06, 2021, 11:40:13 am »
SPI:
1) at a slave the pins are usually called something like Serial Data Out / Serial Data In. That makes the whole setup much clearer.

2) in a situation like this one really starts to appreciate series terminators

Coming from a guy who managed to confuse RX/TX on one interface on some board with 5 asynchronous interfaces and 1 SPI. Single FAILure on a PCB with 600 components...
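
The naming difference discussed above can be sketched as a tiny wiring check in C++. The enum, the function name and the pin-name strings are all made up for illustration, and the sketch only covers pins labelled MOSI/MISO/SCK or TX/RX, not slave pins labelled SDI/SDO.

```cpp
#include <cassert>
#include <string>

enum class Bus { Spi, Uart };

// SPI signals are named after the role on the bus (master out, slave in),
// so same-named pins connect straight through. UART pins are named from each
// device's own point of view, so TX on one side goes to RX on the other.
bool is_correct_connection(Bus bus, const std::string& pin_a, const std::string& pin_b) {
    assert(!pin_a.empty() && !pin_b.empty());
    if (bus == Bus::Spi) {
        // MOSI to MOSI, MISO to MISO, SCK to SCK
        return pin_a == pin_b;
    }
    // UART: the two data lines are crossed
    return (pin_a == "TX" && pin_b == "RX") || (pin_a == "RX" && pin_b == "TX");
}
```

So MOSI to MOSI passes and MOSI to MISO fails, while on the async side TX to RX passes and TX to TX fails.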
 

Online T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21672
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: what's your recent fail?
« Reply #62 on: May 06, 2021, 02:38:25 pm »
I greatly prefer the MOSI/MISO terminology to RX/TX.  It's a shame it's not standard for async.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 
The following users thanked this post: artag, KE5FX

Offline Refrigerator

  • Super Contributor
  • ***
  • Posts: 1542
  • Country: lt
Re: what's your recent fail?
« Reply #63 on: May 06, 2021, 03:19:36 pm »
2AM me forgot to tidy up the silkscreen before I sent my files to the fab.
I have a blog at http://brimmingideas.blogspot.com/ . Now less empty than ever before !
An expert of making MOSFETs explode.
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: what's your recent fail?
« Reply #64 on: May 06, 2021, 03:47:39 pm »
nctnico, regarding your PCB fail: I would be curious as to how you followed up on that PCB issue. Stealthily "optimizing" designs is an absolute showstopper in my book. One of my customers recently found out that a manufacturer had adjusted my layout (fully specified PCB), without giving feedback.
Usually assemblers do give feedback about changes they want to make to PCBs, and in most cases I change the design instead of having the assembler change the Gerber files. In this case I wasn't involved in outsourcing the design, so it could be that the assembler gave feedback and the customer gave an OK, or not. From the little information I have, it seems the PCB manufacturer changed the Gerbers on their own though. This came to light after ruling out every other possibility, so the problem had to be the PCB itself. The assembler wanted to order PCBs from a local quick-turnaround outfit, which promptly complained that the clearances were way below what they could produce. From there it quickly became clear where the problem was.

In a broader sense, having changes made to PCB layouts by an assembler is a bit of a grey area. You'd say soldering is soldering, but every assembler I have come across so far seems to have a specific setup / workflow. PCBs one assembler can solder without any problems are a nightmare to solder for another (using the manufacturer-specified land patterns and paste mask).
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline harerod

  • Frequent Contributor
  • **
  • Posts: 449
  • Country: de
  • ee - digital & analog
    • My services:
Re: what's your recent fail?
« Reply #65 on: May 06, 2021, 05:24:52 pm »
nctnico, thanks for the feedback. Just my observation: for series production, the PCB has to be optimized for the actual manufacturing process. The structures on the PCB plus the components must fit the solder profile. The series manufacturer will try to improve throughput by using the shortest time profile possible.
I see a similar thing with certain cable harness manufacturers, who turn the solder temperature way up, to reduce solder time.

A prototype manufacturer has most of the overall effort in setting up the production, so they can take it easier on the speed. Tombstones are nonexistent when I solder with my lab equipment, rare with prototype manufacturing, but will appear in series production.

I prefer to run prototypes in the series process. Since other restrictions apply (budget, time), I can rarely do this nowadays.
 

Offline David Hess

  • Super Contributor
  • ***
  • Posts: 16611
  • Country: us
  • DavidH
Re: what's your recent fail?
« Reply #66 on: May 07, 2021, 11:51:12 pm »
After several days of randomly changing various things with no positive result, I was inspired to leave it powered up, and after several more days it POSTed.  Now it reliably POSTs, so far.

That's weird but... When I worked in a datacenter I had similar issues with motherboards that at first appeared to be dead. Like, I remember after a power outage I had to repair a very old server in a crudely made enclosure (replace an HDD or something like that). I dropped a screw on the motherboard. The server shut itself down immediately and refused to start again. Worse yet, I didn't have spare parts for the old piece of junk it was. Lucky me, somehow it started after 15 minutes and tens of attempts to boot it.

I have no explanation yet and it took a couple days before it POSTed.  I will be installing Windows 7 on it and maybe it will fail again with some indication of what is going on.
 

Offline AlfBaz

  • Super Contributor
  • ***
  • Posts: 2184
  • Country: au
Re: what's your recent fail?
« Reply #67 on: May 08, 2021, 12:54:22 am »
After several days of randomly changing various things with no positive result, I was inspired to leave it powered up, and after several more days it POSTed.  Now it reliably POSTs, so far.

That's weird but... When I worked in a datacenter I had similar issues with motherboards that at first appeared to be dead. Like, I remember after a power outage I had to repair a very old server in a crudely made enclosure (replace an HDD or something like that). I dropped a screw on the motherboard. The server shut itself down immediately and refused to start again. Worse yet, I didn't have spare parts for the old piece of junk it was. Lucky me, somehow it started after 15 minutes and tens of attempts to boot it.

I have no explanation yet and it took a couple days before it POSTed.  I will be installing Windows 7 on it and maybe it will fail again with some indication of what is going on.

I've just finished having similar issues.
My original rig died, which required me to purchase a new power supply and motherboard. I had to go 2nd hand with the motherboard so I didn't have to buy a new CPU.

The board would not power up after switching the power supply off or unplugging it. I had to wait several minutes after applying power to the PSU before the power button would work.

I did some probing with the aid of a boardview file and found the 3VSB (standby) rail had a massive low-frequency ripple from 3V to 1V at first, and after some minutes it would stabilize to a solid 3V. At that stage it would work.

Initially I thought it had something to do with the CMOS battery, which measured 3V but would drop to less than a volt if you loaded it with a handful of mA.
I replaced said battery and nothing changed.

I suspect a fault with the board somewhere that needs something to charge before the 3V rail comes good.

In this board the 3VSB rail is used to apply logic levels to the "super I/O" chip which handles pulling the psu's pwron pin low
 

Offline David Hess

  • Super Contributor
  • ***
  • Posts: 16611
  • Country: us
  • DavidH
Re: what's your recent fail?
« Reply #68 on: May 08, 2021, 03:19:03 am »
I've just finished having similar issues.
My original rig died, which required me to purchase a new power supply and motherboard. I had to go 2nd hand with the motherboard so I didn't have to buy a new CPU.

The board would not power up after switching the power supply off or unplugging it. I had to wait several minutes after applying power to the PSU before the power button would work.

I did some probing with the aid of a boardview file and found the 3VSB (standby) rail had a massive low-frequency ripple from 3V to 1V at first, and after some minutes it would stabilize to a solid 3V. At that stage it would work.

Initially I thought it had something to do with the CMOS battery, which measured 3V but would drop to less than a volt if you loaded it with a handful of mA.
I replaced said battery and nothing changed.

I suspect a fault with the board somewhere that needs something to charge before the 3V rail comes good.

In this board the 3VSB rail is used to apply logic levels to the "super I/O" chip which handles pulling the psu's pwron pin low

I checked the voltages from the original power supply and found no problems, but changed the power supply anyway and it still did not work.  Then I messed with the CMOS battery and resetting the CMOS data without effect.  It finally POSTed after being left on for days and that was without the CMOS battery, but I have since put a new CMOS battery in and it still POSTs.

What I have not done is leave it off since it first started POSTing again, but eventually that will happen.
 

