EEVblog Electronics Community Forum
Electronics => Beginners => Topic started by: rthorntn on October 15, 2024, 04:48:43 am
-
Hi,
I was reading a thread on ESD killing XMOS audio interfaces (I have two older USB DACs with XMOS that are no longer detected by the OS) and a poster mentioned that the failure analysis they got stated EOS/ESD as the cause.
Is there a way for a home tinkerer to find evidence of ESD failure on a board, with a stereo microscope perhaps?
Thanks.
-
Unless it's a gross failure, it's not really possible.
Even testing at the chip/device level to full specifications may not detect long term damage issues.
-
Yeah, small hidden damage is impossible to detect and causes massive loss of time and insanity. Therefore some level of ESD protection is advisable, especially for beginners and hobbyists. No drastic or expensive measures are needed; a simple workbench ESD mat bonded to an earth connection is usually more than enough. Your hands would periodically rest on the ESD mat while working.
-
As Dave wrote, you can only be certain if it is from a hefty discharge, like lightning, that leaves massive destruction.
Otherwise it would most likely take an electron microscope to inspect every component at chip level to see if there is some damage. It takes a lot of time to open up every chip on the board to expose the die, and it requires knowledge of how the structure should look anyway.
So no way a home tinkerer would be able to determine death by human ESD impact.
-
Certain parts of the board are more prone to ESD damage than others, particularly the USB input circuitry and any protection diodes that may have been designed to handle such discharges.
-
One of the weird things about ESD damage is that it can cause "premature ageing" failures. This probably isn't the right terminology, but I'll explain.
I used to work as a repair technician on a product that made extensive use of 4000-series logic chips - probably the most prone to ESD damage of any chips. Large numbers of these would come in for repair after several weeks or months of service in the field. My workplace was fully ESD protected and once repaired, I would never see them again.
My employer (BT) investigated the manufacturer (Trend) and found that they were using no ESD protection at all. The completed circuit boards were stacked on top of each other with cardboard between them. Trend completely overhauled their processes and we never had a failure after that.
Lessons:
1/ An ESD-damaged component can test perfectly and work perfectly for weeks or months - sometimes longer.
2/ ESD vulnerability is a real thing, albeit less so with modern components.
3/ ESD protection (grounded work mat and wristband - via the correct resistance) works brilliantly.
-
Back in 82-83 I used to work at Datacraft doing Codex 96V29 leased line modems. The blue shoe box. Pretty hot stuff back in the day. Three of the plug in boards had a 15 x 6 array of socketed wire wrapped 4000 series CMOS. Standard procedure for production repairs when you couldn't find the fault by probing around with the CRO was to lift out the ICs one row at a time and plug them into a piece of styrofoam in the same layout as they came out of the board so you could put them back in the right socket when you found a bent pin or bad socket or similar. Never had one fail on the bench but I often wondered what happened down the track. :palm:
-
There are specific issues that are very likely to be caused by ESD such as significantly increased leakage current on CMOS inputs. But that's just one sort of possible damage and not conclusive at that.
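As a rough illustration of how such a leakage check could be done on the bench: put a known resistor in series with the input, measure the voltage across it, and work out the current with Ohm's law. This is only a sketch. The function names are made up, and the 1 uA limit is an assumed placeholder, not a figure for any real part (the datasheet gives the real one). Keep in mind that the meter's own input resistance loads the measurement.

```python
def leakage_current_amps(v_drop_volts: float, r_series_ohms: float) -> float:
    """Input leakage current from the voltage across a series resistor (I = V / R)."""
    if r_series_ohms <= 0:
        raise ValueError("series resistance must be positive")
    return abs(v_drop_volts) / r_series_ohms


def looks_leaky(i_leak_amps: float, datasheet_max_amps: float = 1e-6) -> bool:
    """True if the measured leakage exceeds the datasheet maximum.

    The 1 uA default is an assumed placeholder, not a value for a specific chip.
    """
    return i_leak_amps > datasheet_max_amps


# Made-up readings across a 1 Mohm series resistor:
i_ok = leakage_current_amps(0.050, 1e6)   # 50 mV drop
i_bad = leakage_current_amps(3.0, 1e6)    # 3 V drop
print(looks_leaky(i_ok), looks_leaky(i_bad))
```

A high reading only says the input is leaky. As noted above, it is not conclusive evidence of ESD.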
-
Far above the realm of tinkering, the technology to see the damage is available:
https://desco.descoindustries.com/PPT/ImagesOfESDDamage.ppt
In addition to an SEM (as pcprogrammer already pointed out), it takes substantial know-how and other equipment. I doubt that anyone has ever done this in order to test or repair something. It is R&D work to avoid future problems.
-
One of the weird things about ESD damage is that it can cause "premature ageing" failures.
Yep. Sloppy production lines, that don't control static properly, not only find a lot of failures at the test station at the end of the production line, they also get a lot of returns from early failures in the field.
-
The only way to find if a failed device failed through EOS or some other cause is to decap and inspect the die. There are a lot of poorly controlled production lines out there, so a silicon vendor gets a steady stream of chips sent back for failure analysis, and the vast majority of test results come back as EOS. That could be from ESD, or from some stress applied in operation, like a poorly controlled power supply (e.g. startup spikes). It's quite hard to tell the cause, and the lab reports don't usually try. They just try to distinguish between the physical damage due to over stress and a failure due to a defect on the die which got past assembly and test. For power devices they will also look for a failure due to thermal stress.
-
One of the weird things about ESD damage is that it can cause "premature ageing" failures.
Yep. Sloppy production lines, that don't control static properly, not only find a lot of failures at the test station at the end of the production line, they also get a lot of returns from early failures in the field.
These days I wouldn't be surprised if they don't care as long as the majority of the failures are just outside of warranty.
-
Hi,
EOS - Electrical Over Stress
This is used to describe catastrophic failure of a part where there is significant damage to the part. The die is damaged to the point that only limited failure analysis can be performed. It can be caused by many internal or external events. Exceeding a component's absolute maximum ratings could result in EOS damage. Sometimes circuits which are far away from the pins are damaged. This can also be reported as EOS. Quite often the plastic is cracked.
ESD damage.
Most pins have ESD structures to protect the silicon from electrostatic discharge. Different pins may have different structures. If the de-capping and examination show that only the ESD structure is damaged, the failure analysis report will probably identify the cause as ESD.
Curve tracing
A VI curve tracer, for example a Tektronix 576 or Huntron Tracker, can be used to identify damage to pins. This is done by comparing a good device to a suspect device.
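If the curves are captured as numbers rather than eyeballed on a screen, the good-versus-suspect comparison might look something like this minimal sketch. The pin names, the sample points and the 100 uA tolerance are all invented for illustration, and it assumes both devices were swept at the same test voltages.

```python
from typing import Dict, List, Sequence, Tuple

Curve = Sequence[Tuple[float, float]]  # (volts, amps) points from one pin sweep


def max_current_deviation(good: Curve, suspect: Curve) -> float:
    """Largest absolute current difference between two curves, point by point."""
    if not good or len(good) != len(suspect):
        raise ValueError("curves must be non-empty and the same length")
    for (v_good, _), (v_susp, _) in zip(good, suspect):
        if abs(v_good - v_susp) > 1e-9:
            raise ValueError("curves must be sampled at the same voltages")
    return max(abs(i_good - i_susp) for (_, i_good), (_, i_susp) in zip(good, suspect))


def suspect_pins(
    good_by_pin: Dict[str, Curve],
    suspect_by_pin: Dict[str, Curve],
    tolerance_amps: float = 1e-4,  # assumed threshold, tune for the real part
) -> List[str]:
    """Pins whose curve differs from the known-good device by more than the tolerance."""
    return sorted(
        pin
        for pin in good_by_pin
        if max_current_deviation(good_by_pin[pin], suspect_by_pin[pin]) > tolerance_amps
    )


# Invented data: the suspect device's "D-" pin conducts far more than the good one.
normal = [(-1.0, -0.010), (0.0, 0.0), (1.0, 0.010)]
damaged = [(-1.0, -0.050), (0.0, 0.0), (1.0, 0.050)]
good_device = {"D+": normal, "D-": normal}
suspect_device = {"D+": normal, "D-": damaged}
print(suspect_pins(good_device, suspect_device))
```

A pin that shows up here is only a candidate for damage. The comparison says nothing about what caused the difference.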
Regards,
Jay_Diddy_B
-
Thanks all!
Wondering out loud if something like the Jumperless https://www.crowdsupply.com/architeuthis-flux/jumperless-v5 and some sort of pogo pin "jig" that ran code to cycle through a battery of tests designed for each IC is a feasible system to use to detect such issues?
-
I think there will be ESD damage that won't be detectable by exercising the chip electrically, until it fails. However, I cannot be absolutely certain. I doubt that a comprehensive study has been made - it would require hundreds or thousands of chips of all different types, variable strength electrostatic discharges made to a random subset of them, comprehensive testing and characterisation of input currents, functionality, power currents, temperatures of all the chips. Then they would each need to be installed in a "product" (in reality, a test bed that exercises their functionality) and run until they fail, which means (in my experience) at least a year. Or don't fail, of course.
Only then can we know whether there were any signs at the beginning of the test that predicted a failure a year later.
As I say, I doubt it has been done, but I might be wrong. Perhaps some sort of characterisation is performed on chips destined for spacecraft, undersea cables - anywhere a failure would be catastrophic and prohibitively expensive. I would imagine the tests would be to select chips right from the middle of the range on the test parameters. I doubt there would be much interest in then testing the rejected chips to see if they do fail. They'd be onto the next job.
I would love to hear others' experience on this topic.
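To make the "middle of the range" idea concrete, here is a minimal sketch of selecting the parts whose measured value sits closest to the batch median. It is only a guess at how such screening might look, not a description of any real screening flow. The serial numbers, the readings and the "keep half" fraction are all made up.

```python
import statistics
from typing import Dict, List


def select_mid_range(measurements: Dict[str, float], keep_fraction: float = 0.5) -> List[str]:
    """Return the parts whose measured value is closest to the batch median."""
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not measurements:
        return []
    median = statistics.median(measurements.values())
    # Rank by distance from the median; the part name breaks exact ties.
    ranked = sorted(measurements, key=lambda part: (abs(measurements[part] - median), part))
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return sorted(ranked[:n_keep])


# Made-up input leakage readings in nA for six parts.
batch = {"A1": 4.0, "A2": 5.0, "A3": 5.2, "A4": 9.0, "A5": 1.0, "A6": 5.1}
print(select_mid_range(batch))
```

Whether the rejected outliers really fail earlier is exactly the open question above. This only picks the parts, it predicts nothing.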
-
Back in the 80s BT (or was it still the GPO at the time?) did extensive studies of component failures, trying to find why early electronic telephone exchanges, like the TXE4, worked smoothly for ages, crashed, and usually were OK after a reset. The biggest issue turned out to be tantalum capacitors growing whiskers, and we moved to solid aluminium ones. Those appeared in the market at just the right time. However, among this work a lot of dead semiconductors were inspected, and a lot of writeups and photographs were distributed to BT's pool of suppliers. The photographs of ESD damage were really surprising. Everyone expected the damage would be around the I/O ring, and mostly it was. However, quite often a pin hole could be seen punched somewhere in the heart of the device. It seemed very strange, and I don't remember seeing an analysis that really got to the true cause. Just a lot of speculation. Most total failures leave physical damage that goes well beyond the failing device, so there is so much damage something critical to the product's operation has almost certainly been affected. However, when the stress is just quite bad, but not of the popping level, a visual inspection with an SEM doesn't seem to show anything, yet things like pin protection diodes may no longer be doing their job.
In the 70s I had several MPUs of different types that responded to shaking. Shake them and things were OK, shake them again and they crashed, shake them again and they might be OK again. I did a detailed analysis of 2 of these. One was an MC6800. I can't remember the other. I was able to determine exactly what was breaking and fixing itself. The MC6800 had one bit in the ALU stop working. The other I think had an internal bus bit failing.
-
Fascinating! I, too, worked for BT, starting in 1973 and leaving at the end of 2005, 32 years later. The first half of my career was in the field, then exchanges, then an office. The second half at their research labs at Martlesham Heath. I was lucky - I enjoyed my career enormously.
-
I think the work I described all happened at Martlesham Heath. Did you ever meet Charlie Kao? It seems everyone I knew in UK telecoms met him except me. :)
-
I haven't heard of him, unfortunately. Mind you, there were 5000 people there so I only got to work with a small subset of them.
Sorry - this is way off-topic! Back to invisible ESD damage.... :)
-
Nobel prize winner. The guy who realised fibre optic communications was possible, because losses in sufficiently pure glass could approach zero. At UCL and STC/Nortel and in Hong Kong I kept being close by the guy and never met him. I believe he spent a lot of time at Martlesham Heath. By the early 80s he was someone everyone wanted to meet.