Author Topic: STM H7 lasting only 2 years due to heating  (Read 2039 times)


Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4634
  • Country: gb
  • Doing electronics since the 1960s...
STM H7 lasting only 2 years due to heating
« on: April 04, 2025, 01:15:54 pm »
I've seen this before - it is a few years old now - but I wonder how many know about it

http://efton.sk/STM32/gotcha/g57.html

This issue is discussed in AN5337. At the maximum digital power supply voltage - VOS0 - and 90°C junction temperature (which at the highest frequencies and many peripherals running can be realistically reached at ambient temperatures around 50°C, nothing unusual in less-than-well ventilated enclosures) the expected lifetime is around 5 years, rapidly decreasing to 2 years with a mere 15°C increase of junction temperature to the maximum allowed 105°C.

I remember, many years ago, one UK manufacturer of industrial automation suffered a very high failure rate on a UART which was one of the TMS9900 family. They mostly packed up after a year or two, but it wasn't due to heat. It was some "silicon issue".

How can ST characterise the 2 year lifetime? Could it be 1 year instead in some cases?

The point is that 105C junction temp is not hard to reach. See e.g. here
https://www.eevblog.com/forum/microcontrollers/stm-32f4-reading-cpu-temperature/msg4981063/#msg4981063
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline m98

  • Frequent Contributor
  • **
  • Posts: 644
  • Country: de
Re: STM H7 lasting only 2 years due to heating
« Reply #1 on: April 04, 2025, 02:52:58 pm »
Is that just an overly cautious estimate with some huge margins, or is it actually that bad? 45 nm is still comparatively big and ancient, even consumer CPUs with much smaller structures already last way longer. Is it known where the failure point is?
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4634
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM H7 lasting only 2 years due to heating
« Reply #2 on: April 04, 2025, 05:43:42 pm »
Indeed; I wonder what actually fails.

Thermal cycling is not implied here, although that is probably the #1 failure mechanism in electronics generally. But if you implemented some power-down stuff, or even just the proper RTOS stuff on idle threads, you will get big thermal cycling too.

The plastic above the chip is probably only 0.5mm thick so a heatsink would work, FWIW.
« Last Edit: April 04, 2025, 05:45:39 pm by peter-h »
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8652
Re: STM H7 lasting only 2 years due to heating
« Reply #3 on: April 04, 2025, 10:47:16 pm »
> 45 nm is still comparatively big and ancient, even consumer CPUs with much smaller structures already last way longer.
Those CPUs are designed and tested by companies with much more experience at small process sizes... and even they get things wrong occasionally:

https://www.eevblog.com/forum/general-computing/intel-seems-to-be-bleeding-out/
https://www.eevblog.com/forum/microcontrollers/intel-atom-c2000-failures/
https://www.anandtech.com/show/4143/the-source-of-intels-cougar-point-sata-bug
 
The following users thanked this post: thm_w

Offline wek

  • Frequent Contributor
  • **
  • Posts: 561
  • Country: sk
Re: STM H7 lasting only 2 years due to heating
« Reply #4 on: April 05, 2025, 08:06:14 am »
>  45 nm is still comparatively big and ancient, even consumer CPUs with much smaller structures already last way longer.

Do they? We are talking about 24/7 operation at elevated temperatures here (105/140deg.C). Consumer CPUs are not operated at such temperatures; temperature management in the consumer CPUs is a big thing for a reason. Note that lifetime decreases with temperature exponentially.

Speaking of failure mechanisms, there may be many; besides the relatively obvious FLASH failure/leakage (which again is not a thing in the consumer CPUs), it may be also contamination from or through package (which again in the consumer CPUs is very different and way more expensive) and electromigration, which again may impact analog structures (again generally absent from consumer CPUs) much more than purely digital ones.

Also, the ST limitation is to be read from a statistical point of view: in semiconductors, the usual measure is failures within 1E9 hours of operation (i.e. approx. one failure per 10 million devices in 100 hours which is roughly a week) - hardly is there a cohort of ten million PCs running at high gear for several years, and would there occur a failure once a week in there, one would hardly attribute that particularly to processors being "worn out".
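
For anyone who wants to check the FIT arithmetic, here is a minimal Python sketch. Only the definition (1 FIT = one failure per 1E9 device-hours) is standard; the fleet size and the 1 FIT rate are illustrative numbers, not ST data, and a constant failure rate is assumed.

```python
# FIT = failures per 1e9 device-hours (standard reliability unit).
DEVICE_HOURS_PER_FIT = 1e9

def expected_failures(fit, devices, hours):
    """Expected failures in a fleet, assuming a constant failure rate."""
    return fit * devices * hours / DEVICE_HOURS_PER_FIT

# Illustrative fleet: ten million devices at 1 FIT.
in_100_hours = expected_failures(fit=1, devices=10_000_000, hours=100)
in_one_week = expected_failures(fit=1, devices=10_000_000, hours=168)

print(f"100 h: {in_100_hours:.2f} failures, 168 h: {in_one_week:.2f} failures")
```

100 hours is a bit over four days, so "one failure per week" is the right order of magnitude for such a fleet.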

> How can ST characterise the 2 year lifetime? Could it be 1 year instead in some cases?

Through accelerated testing: they simply bake them at high temperatures while running, and estimate the effect at lower temperatures from that, using the Arrhenius equation. It's again a statistical method, i.e. "in some cases" is part of the result, as of course some spread is expected, and lifetime is really the point where the failures-per-1E9-hours rate rises from below one to a few, i.e. it's not that if you bake the part at 105deg.C it will fail when the clock rolls over exactly 2 years. This method also utilizes data from partial tests on similar structures (e.g. running various, including unrealistically high, currents through a thin conductor which is part of a test structure purpose-made for this kind of test, using the same process as is used to produce the chips).
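
As a rough illustration of the Arrhenius scaling, the Python sketch below backs out the activation energy implied by the two public data points quoted at the top of the thread (about 5 years at 90°C, about 2 years at 105°C) and uses it to scale to another junction temperature. This assumes a single thermally activated mechanism at fixed voltage, which is a simplification; real qualification combines several mechanisms, and the function names are made up for this example.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def kelvin(celsius):
    return celsius + 273.15

def implied_activation_energy(life1, t1_c, life2, t2_c):
    """Ea (eV) for which lifetime ~ exp(Ea / kT) passes through both points."""
    inv_t_diff = 1.0 / kelvin(t1_c) - 1.0 / kelvin(t2_c)
    return K_BOLTZMANN_EV * math.log(life1 / life2) / inv_t_diff

def arrhenius_lifetime(life_ref, t_ref_c, t_c, ea_ev):
    """Scale a reference lifetime to another junction temperature."""
    exponent = ea_ev / K_BOLTZMANN_EV * (1.0 / kelvin(t_c) - 1.0 / kelvin(t_ref_c))
    return life_ref * math.exp(exponent)

# Data points from the first post: 5 years at 90 C, 2 years at 105 C.
ea = implied_activation_energy(5.0, 90.0, 2.0, 105.0)
life_at_75 = arrhenius_lifetime(5.0, 90.0, 75.0, ea)

print(f"implied Ea = {ea:.2f} eV, extrapolated lifetime at 75 C = {life_at_75:.1f} years")
```

The implied Ea comes out at roughly 0.7 eV. Extrapolating far outside the two fitted points should be treated with suspicion, since a different mechanism may dominate there.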

Btw. the 'G4 (again 45nm) suffers from the issue, too, although apparently significantly less. I haven't seen any such information about the other 45nm STM32 ('U5, 'H5), but I personally would assume the 'G4 AN applies to them, too, and try to stay on the safe side. It's probably obvious that I am not an ST insider, nor do I represent $M++ of purchasing power which would get me such information from ST (even if that would most probably be under NDA, too).

This all is of course just speculation from the information which is public. And, as speculations go, there's no point in drawing any other conclusion than simply taking it into consideration and avoiding the circumstances causing the problem (i.e. either sticking to 90nm designs, or avoiding running at the highest powers). I am not aware of ST publishing thermal data for heatsinks mounted on the package, and given those packages are not designed to have heatsinks mounted on them, I wouldn't expect ST to ever publish such information, as that would put them at risk if some step in mounting the heatsink (e.g. involving unexpected/untested chemicals and/or forces) actually shortened the lifetime or led to catastrophic failures. Nonetheless, if one can live with these risks, one can relatively simply characterize/estimate the effectiveness of such a solution using the on-chip temperature probe.

JW
« Last Edit: April 05, 2025, 08:16:42 am by wek »
 

Offline voltsandjolts

  • Supporter
  • ****
  • Posts: 2755
  • Country: gb
Re: STM H7 lasting only 2 years due to heating
« Reply #5 on: April 05, 2025, 09:15:56 am »
Some of us with Agilent/Keysight 3446x series meters are well aware of STM SPEAR320 failures, although dunno if those failures are ambient temperature related or something else.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9779
  • Country: fi
Re: STM H7 lasting only 2 years due to heating
« Reply #6 on: April 05, 2025, 09:18:47 am »
For the few designs I used H7 on, I opted for external switch mode supply for Vcore, bypassing the internal linear regulator. For one, this halves your power consumption (compared to running out of 3.3V + internal linear regulator), and maybe more relevantly to this thread, more than halves the power dissipation inside that chip. And since the internal LDO is, AFAIK, on the very same die, it should make a HUGE difference on lifetime, too.
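
A back-of-envelope Python sketch of that comparison follows. The 1.3 V core voltage, 0.3 A core current and 85% converter efficiency are assumed round numbers for illustration, not datasheet values, and I/O and analog supply currents are ignored.

```python
def internal_ldo(v_in, i_core):
    """Linear regulator on the die: input current equals core current,
    so everything drawn from v_in ends up as heat inside the MCU."""
    total_w = v_in * i_core
    return {"input_w": total_w, "mcu_w": total_w}

def external_smps(v_core, i_core, efficiency):
    """External switcher: the MCU dissipates only the core power,
    converter losses are dissipated in the external regulator."""
    mcu_w = v_core * i_core
    return {"input_w": mcu_w / efficiency, "mcu_w": mcu_w}

# Assumed operating point, for illustration only.
ldo = internal_ldo(v_in=3.3, i_core=0.3)
smps = external_smps(v_core=1.3, i_core=0.3, efficiency=0.85)

print(f"internal LDO : {ldo['input_w']:.2f} W drawn, {ldo['mcu_w']:.2f} W in the MCU")
print(f"external SMPS: {smps['input_w']:.2f} W drawn, {smps['mcu_w']:.2f} W in the MCU")
```

With these assumptions the total draw roughly halves and the heat dissipated in the MCU itself drops to about 40% of the LDO case.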
 
The following users thanked this post: peter-h, wek

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16379
  • Country: fr
Re: STM H7 lasting only 2 years due to heating
« Reply #7 on: April 05, 2025, 02:48:55 pm »
> For the few designs I used H7 on, I opted for external switch mode supply for Vcore, bypassing the internal linear regulator. For one, this halves your power consumption (compared to running out of 3.3V + internal linear regulator), and maybe more relevantly to this thread, more than halves the power dissipation inside that chip. And since the internal LDO is, AFAIK, on the very same die, it should make a HUGE difference on lifetime, too.

I certainly recommend that too.
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4634
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM H7 lasting only 2 years due to heating
« Reply #8 on: April 05, 2025, 03:16:51 pm »
What do you do to ensure power supply sequencing?

EDIT: I found some appnotes and it looks like the on-chip linear regulator is always used at startup, and then you can set a bit in a register to enable the on-chip switcher which, with external components, does it a lot more efficiently.

Actually a lot of products use a heatsink on top of a plastic package.
« Last Edit: April 05, 2025, 04:52:16 pm by peter-h »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9779
  • Country: fi
Re: STM H7 lasting only 2 years due to heating
« Reply #9 on: April 05, 2025, 05:59:11 pm »
> What do you do to ensure power supply sequencing?
>
> EDIT: I found some appnotes and it looks like the on-chip linear regulator is always used at startup, and then you can set a bit in a register to enable the on-chip switcher which, with external components, does it a lot more efficiently.
>
> Actually a lot of products use a heatsink on top of a plastic package.

Yes, it boots with the internal regulator and you turn it off. I have done nothing special with power sequencing. The only problem I ever saw was that NRST really needed that filtering capacitor when using external Vcore. Well, it's required anyway if you read all the docs, but somehow I had survived without it before. I don't know if it was just a coincidence, but without the NRST filter cap I couldn't get the thing to boot at all once external Vcore was present.
 
The following users thanked this post: kaevee, wek

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8652
Re: STM H7 lasting only 2 years due to heating
« Reply #10 on: April 06, 2025, 03:20:00 am »
>  45 nm is still comparatively big and ancient, even consumer CPUs with much smaller structures already last way longer.

> Do they? We are talking about 24/7 operation at elevated temperatures here (105/140deg.C). Consumer CPUs are not operated at such temperatures; temperature management in the consumer CPUs is a big thing for a reason. Note that lifetime decreases with temperature exponentially.
I have had an Intel 8th-gen (14nm) CPU at thermal throttling temperature (100C) for 24/7 for the past 6 years in a laptop that was doing some long-running calculations, so constant full-load operation; the SSD got cooked first and died after ~3 years, but the CPU is still fine.
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4634
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM H7 lasting only 2 years due to heating
« Reply #11 on: April 06, 2025, 06:16:26 am »
The NRST cap for 32F4 is only 10nF. I wonder why it matters. There must be another spec for the min speed of VCC rise.
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 6575
  • Country: es
Re: STM H7 lasting only 2 years due to heating
« Reply #12 on: April 06, 2025, 10:34:42 am »
> Consumer CPUs are not operated at such temperatures; temperature management in the consumer CPUs is a big thing for a reason. Note that lifetime decreases with temperature exponentially.
Forget the days when you could place a huge heatsink and run full load 24/7 at 50ºC...
Modern CPUs are designed to work at 90ºC all day, and if they don't, turbo boost will increase voltage and frequency until they do.
The dies became so small that the heat can't get out as fast as it did with larger nodes, so the only way to get rid of it is higher Tj.

> The point is that 105C junction temp is not hard to reach. See e.g. here
> https://www.eevblog.com/forum/microcontrollers/stm-32f4-reading-cpu-temperature/msg4981063/#msg4981063
Then don't place the stm32 right next to a burning hot LDO!
If you can't... maybe put a split in the pcb to thermally isolate that section.
Ever heard of heatsinks and fans?  :)
« Last Edit: April 06, 2025, 01:26:58 pm by DavidAlfa »
Hantek DSO2x1x            Drive        FAQ          DON'T BUY HANTEK! (Aka HALF-MADE)
Stm32 Soldering FW      Forum      Github      Donate
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4634
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM H7 lasting only 2 years due to heating
« Reply #13 on: April 06, 2025, 10:42:27 am »
Quote
Then don't place the stm32 right next to a burning hot LDO!
If you can't... maybe put a split in the pcb to thermally isolate that section.
Ever heard of heatsinks and fans?

I am doing none of that :)

But it is a fair point that using a linear LDO external to the CPU is way better than dissipating the same heat on the CPU chip.

Getting back to that NRST cap, the value must relate to the minimum VCC (+3.3V) rise spec, otherwise it is a bit meaningless. But maybe it relates to the internal MOSFET which pulls down NRST?
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 6575
  • Country: es
Re: STM H7 lasting only 2 years due to heating
« Reply #14 on: April 06, 2025, 01:29:41 pm »
If you did it wrong and it worked, consider yourself lucky, but don't overthink it when it fails because you weren't doing it right in the first place!
Just follow the spec and go have a beer in peace  :)!
« Last Edit: April 06, 2025, 01:31:18 pm by DavidAlfa »
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4634
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM H7 lasting only 2 years due to heating
« Reply #15 on: April 06, 2025, 09:46:19 pm »
It would be an interesting experiment to stick a heatsink on an STM32 package and see what it does.

I did a quick test with a QFP100 32F417VGT6.

Measuring the chip temp (see the above linked thread on CPU temp measurement) I have

+46.5C no heatsink
+42.0C little heatsink (pic 1 below)
+33.1C "infinite" heatsink (pic 2 below)

heatsink 1 - 1cm high


heatsink 2 - a 10cm tall block of aluminium :)


Obviously a high grade thermal compound was used.

The PCB temp, measured with a PT100 on the PCB, is about +31C but the exact value of this as seen by the device is obviously important to understand the infinite heatsink result above. With a PCB to chip delta of only about 2C (2K for the purists) it certainly does look like the infinite heatsink is highly effective through the QFP100 package - just as I would expect given the chip is some 5x5mm and the plastic is only some 0.5mm thick. Similarly any decent heatsink, or a heat pipe setup, should be very effective.
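
To make the comparison easier to read, here is a small Python sketch that turns the three readings above into temperature rise over the PCB. It uses only the numbers from this post; the 31°C PCB figure is the approximate PT100 value, so the results are only as good as that reference. The dissipated power was not measured, so no thermal resistance is derived.

```python
PCB_TEMP_C = 31.0  # approximate PT100 reading quoted above

readings_c = {
    "no heatsink": 46.5,
    "small heatsink": 42.0,
    "aluminium block": 33.1,
}

def rise_over_pcb(chip_c, pcb_c=PCB_TEMP_C):
    """Chip temperature rise above the PCB reference, in kelvin."""
    return chip_c - pcb_c

baseline = rise_over_pcb(readings_c["no heatsink"])

for name, chip_c in readings_c.items():
    rise = rise_over_pcb(chip_c)
    reduction = 100.0 * (1.0 - rise / baseline)
    print(f"{name}: rise {rise:.1f} K, {reduction:.0f}% below the bare package")
```

On these figures the small heatsink removes a bit under a third of the rise and the big block removes most of it.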

The small heatsink is near-useless, but it is tiny. It would probably do something if you are running an H7 at +105C and there is some forced airflow around ;)
 
The following users thanked this post: thm_w, Siwastaja

Offline jnk0le

  • Regular Contributor
  • *
  • Posts: 136
  • Country: pl
Re: STM H7 lasting only 2 years due to heating
« Reply #16 on: April 07, 2025, 12:20:22 am »
Some large QFP packaged chips provide a thermal pad, so the pcb can take some heat away.
H7 family (H723Zxx etc.) and STMs in general don't. (except the qfn of course)

Though, heatsinks on plastic packages are quite common, so it wouldn't be any worse with STMs.
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 561
  • Country: sk
Re: STM H7 lasting only 2 years due to heating
« Reply #17 on: April 07, 2025, 09:29:22 am »
The second "heatsink" is not necessarily a better heatsink. It may also be the way you mounted it - it appears to me that you may have used too much thermal compound (which in fact has very poor thermal conductivity and is only intended to improve conduction somewhat by filling in the surface roughness, as it still conducts way better than the air that would otherwise be there), and the heavier item might have squeezed out the excess better.

Also, would there be the same contact, the second is simply a bigger thermal mass (while having a great conductivity), so it simply might have taken a way longer time until it heats up (i.e. stops cooling). There might be some chimney effect along its length, but I doubt it would be that pronounced.

Btw. a nice experiment is to observe the internal temperature probe's output (provided it's filtered enough to stop fluctuating excessively) and then simply touching the surface of the package by the index finger. It's counter-intuitive, but given the internal is >37deg.C, the finger does provide observable cooling.

JW
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9779
  • Country: fi
Re: STM H7 lasting only 2 years due to heating
« Reply #18 on: April 07, 2025, 10:07:55 am »
> The second "heatsink" is not necessarily a better heatsink. It may also be the way you mounted it - it appears to me that you may have used too much thermal compound (which in fact has very poor thermal conductivity and is only intended to improve conduction somewhat by filling in the surface roughness, as it still conducts way better than the air that would otherwise be there), and the heavier item might have squeezed out the excess better.
>
> Also, would there be the same contact, the second is simply a bigger thermal mass (while having a great conductivity), so it simply might have taken a way longer time until it heats up (i.e. stops cooling). There might be some chimney effect along its length, but I doubt it would be that pronounced.
>
> Btw. a nice experiment is to observe the internal temperature probe's output (provided it's filtered enough to stop fluctuating excessively) and then simply touching the surface of the package by the index finger. It's counter-intuitive, but given the internal is >37deg.C, the finger does provide observable cooling.

Such a solid block of metal is a good approximation of an "infinite heatsink" given that the test time is short enough, so while not very effective in an actual application (poor weight-to-surface-area ratio), it's great for short testing. And I believe peter-h has no reason to wait for more than a few minutes.

The correct amount of paste is very hard to judge from images. Squeezing out is not a bad sign (except for losing $0.02 worth of expensive paste); these pastes are designed to squeeze. In fact, an excessive amount seems like the simplest way of reducing the risk of voids, something which happens easily after reading misinformation online that "you must not use too much of the paste" or "use as little as possible".

But it boils down to the force being used. With enough force, excess paste is simply squeezed out, voids minimized and thermal conductivity maximized. Good thermal pastes are designed to flow easily enough so that required force to obtain thin layer is not too much to cause board flexing enough to destroy MLCCs etc. And on the other hand, if you press way too lightly, the paste is not going to spread. Then it's irrelevant if you used "as little as possible" or too much; it will fail anyway.

If you try to apply some "optimum layer thickness" directly (say 0.1mm), then you are guaranteed to have voids. It needs to squeeze, that way it gets everywhere. Optimum way to spread is to put a large blob in the middle (not spread it), press down evenly and observe it squeezing out from all four sides at approximately the same time.

Similar principles are relevant in stuff like screen printing or solder paste printing. The more, the better: excess squeezing out from every corner is the proof it is also everywhere it needs to be.

It is possible though that the weight of the "heatsink 2" contributed to good thermal contact. Maybe "heatsink 1" would have performed better if "heatsink 2" was used as a tool to press it down against the chip.
« Last Edit: April 07, 2025, 10:11:38 am by Siwastaja »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9779
  • Country: fi
Re: STM H7 lasting only 2 years due to heating
« Reply #19 on: April 07, 2025, 10:14:31 am »
... and then again, I would not recommend putting effort into heatsinking these MCUs. It's not like the actual power dissipation is that much. The problem is ST artificially promoting the ease of using the integrated linear regulator, more than doubling the power dissipation. It makes little sense, because an external regulator could manage a higher Tj than this processor, and heat could be moved away over a larger area, possibly inches away. So my tip, again, is: don't use the internal regulator and the problem's completely gone. A tiny external regulator (linear or switch mode) costs one tenth of the heatsink.

But maybe heatsinking is the only way to salvage existing boards.
 

Offline peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4634
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM H7 lasting only 2 years due to heating
« Reply #20 on: April 07, 2025, 10:17:23 am »
I made sure the compound layer was really thin. That first heatsink would be poor. Too small. Probably 50-100C/watt. It would work if there was a fan nearby but most people avoid fans; they always fail after some years due to dust.

Sure; that aluminium block would have heated up eventually :) I did see the CPU chip temp rising although very slowly. It was just a test to see how well heat flows out of the device, and clearly it flows very well. I think if one had to get rid of 1-2W (probably about right for an H7 with 105C Tj) then a little heat pipe arrangement would work well. One doesn't need to achieve much of a drop to improve device life dramatically.
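
A quick Python sketch of the thermal budget, using the steady-state relation delta_T = P x theta. The 15 K target is the 105°C to 90°C step from the first post and the 1-2 W range is the guess above; neither is a measured value.

```python
def theta_reduction_needed(target_drop_k, power_w):
    """Reduction in junction-to-ambient thermal resistance (K/W)
    needed to lower Tj by target_drop_k at a given dissipation."""
    return target_drop_k / power_w

def temperature_rise(power_w, theta_k_per_w):
    """Steady-state temperature rise across a thermal resistance."""
    return power_w * theta_k_per_w

# Dropping Tj from 105 C to 90 C at the guessed 1 W and 2 W.
for power_w in (1.0, 2.0):
    needed = theta_reduction_needed(15.0, power_w)
    print(f"{power_w:.0f} W: total thermal resistance must fall by {needed:.1f} K/W")
```

So at 2 W the total junction-to-ambient figure only has to come down by 7.5 K/W to gain the 15 K.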

The ally block (a part of an aircraft nose landing gear) had a very smooth and straight surface on its end. I machined it myself :)

One day I will go back to doing the Tj measurement properly i.e. calibrate the sensor. One has to hack the code to take the temp reading very fast (a millisecond) after power-up.
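
The idea, sketched in Python: take one raw reading right after power-up, when the die is still at a known ambient temperature, and use that as the offset, with the sensor slope taken from the datasheet. The 2.5 mV/°C slope, 3.3 V reference, 12-bit ADC and the raw counts below are assumptions for illustration; check the datasheet for the actual part.

```python
def make_temp_converter(raw_at_powerup, ambient_c, slope_mv_per_c,
                        vref_mv=3300.0, full_scale=4095):
    """One-point calibration: the power-up reading pins the offset,
    the slope comes from the datasheet (assumed, not measured)."""
    mv_per_count = vref_mv / full_scale

    def convert(raw):
        delta_mv = (raw - raw_at_powerup) * mv_per_count
        return ambient_c + delta_mv / slope_mv_per_c

    return convert

# Hypothetical numbers: 940 counts read at power-up in a 25 C room.
to_celsius = make_temp_converter(raw_at_powerup=940, ambient_c=25.0,
                                 slope_mv_per_c=2.5)

print(f"{to_celsius(971):.1f} C")  # 31 counts above the power-up reading
```

This only removes the offset error. Any error in the slope still scales with the distance from the calibration temperature.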

Just looked up some 1.8V regs, 300mA+. They are amazingly cheap e.g. MC33375ST-1.8T3G is about 20p. Then chips like AMS1117 (1A) on LCSC. Some of the prices are crazy low - a few pence. Using an external reg is really a no-brainer.
« Last Edit: April 07, 2025, 10:53:17 am by peter-h »
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 561
  • Country: sk
Re: STM H7 lasting only 2 years due to heating
« Reply #21 on: April 07, 2025, 11:36:12 am »
> ST artificially promoting the ease of using the integrated linear regulator

I don't think they are *promoting* it in any particular way. It's simply a convenience.

The whole issue is about user awareness - this situation is quite new to those who were accustomed to the 8-bitters or even the lower-end 32-bitters.

JW
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9779
  • Country: fi
Re: STM H7 lasting only 2 years due to heating
« Reply #22 on: April 07, 2025, 11:53:59 am »
> The whole issue is about user awareness - this situation is quite new to those who were accustomed to the 8-bitters or even the lower-end 32-bitters.

Yeah. The scope for internal linear regulator is quite narrow. Low- and mid-end chips do not need it, because they operate the whole chip at 3.3V. Highest-end should not use it, because it causes too much power dissipation, which external regulator solves. This leaves basically something like STM32F7 running at ~200MHz, where power dissipation isn't a problem yet, but the convenience from internal regulator is very real.

Users should be guided to use external regulator but you won't see a lot of that in documentation, appnotes or example circuits.
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 561
  • Country: sk
Re: STM H7 lasting only 2 years due to heating
« Reply #23 on: April 07, 2025, 12:36:40 pm »
> Low- and mid-end chips do not need it, because they operate the whole chip at 3.3V.

No, they don't. The digital core of every STM32 runs at somewhere between 1.8V and 1.2V or below, even in models where this voltage domain is not exposed on pins.

JW
 
The following users thanked this post: hans

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16379
  • Country: fr
Re: STM H7 lasting only 2 years due to heating
« Reply #24 on: April 07, 2025, 12:56:18 pm »
I don't think using a heatsink on something like an STM32F4 MCU makes much sense. Sure, you'll see a non-negligible difference, but between like 30°C and 45°C, who cares, that's barely tickling it.
For a H7 running at full speed, that's something that can be warranted depending on the environment, although I have rarely seen that done in practice.
That depends on the environment of course. The die temperature will be relative to the ambient temp so in hot environments, the die may more easily reach dangerous territory.
 

