Author Topic: FCBGA: Why does heating the inner solder of a chip 'revives' it?  (Read 3639 times)


Offline jotwerde (Topic starter)

  • Contributor
  • Posts: 18
  • Country: bg
So, I've watched this video from Louis Rossmann:




If I got it right, what he says is that reballing a dead flip-chip GPU doesn't repair it and that you have to replace the chip to repair it properly, because the problem is that the GPU itself has died, not that the solder balls have cracked.
He also mentions that while a GPU is being reballed, the inside of the chip gets heated too, which reflows the solder INSIDE the chip, and that this, not the reballing, is why the chip works for a while afterwards.

I guess the solder inside the chip he talks about are the solder bumps in the following pics:



But why does reflowing the solder bumps inside a dead GPU (essentially an 'inner' reflow of the chip, if I get this right) revive it? I mean, the GPU is dead, so why would reflowing the bumps fix it?

Thanks.
 

Online Ian.M

  • Super Contributor
  • ***
  • Posts: 13216
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #1 on: August 29, 2018, 03:05:29 pm »
It's probably not 'dead' dead unless the ball that failed was supplying a critical power rail with tight sequencing requirements.

Thermal expansion and contraction of the die relative to the carrier causes the fault by fatiguing the inner balls until they crack; expansion and contraction of the underfill then 'mushes' the metal at the crack surfaces until they are no longer in contact.  If you heat the chip to the inner ball reflow temperature, surface tension evens out the surfaces on either side of the crack, probably doming them towards the center of the pad.  However, with no opportunity to add fresh flux, the solder on either side of the crack doesn't reunite, due to surface oxidisation and possible contamination from underfill outgassing.  As the underfill cools, it forces the solder surfaces together, and the contact pressure resulting from the difference between the reflow temperature and the max die temperature keeps the chip working.  That lasts until repeated thermal cycling, especially switching off, makes the underfill expand and contract enough to mush the solder at the point of contact too much again.
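
To put a rough number on the die/carrier mismatch described above, here is a minimal sketch of the textbook first-order estimate: the shear strain on a bump is its distance from the die centre times the CTE difference times the temperature swing, divided by the bump height. All values are illustrative assumptions, not measurements from any particular GPU: roughly 2.6 ppm/K for silicon, 15 ppm/K for an organic carrier, a 60 K swing, a corner bump 10 mm from the centre, and a 70 µm tall bump. The function name is hypothetical, and the estimate ignores the underfill, which in a real package takes up much of this load.

```python
def cte_mismatch_shear_strain(cte_die_ppm, cte_carrier_ppm, delta_t_k,
                              dist_from_centre_mm, bump_height_mm):
    """First-order shear strain on a bump: distance * delta_CTE * delta_T / height."""
    if bump_height_mm <= 0:
        raise ValueError("bump height must be positive")
    # Convert the CTE difference from ppm/K to 1/K.
    delta_cte_per_k = abs(cte_carrier_ppm - cte_die_ppm) * 1e-6
    # Relative in-plane displacement between die and carrier at this bump.
    displacement_mm = dist_from_centre_mm * delta_cte_per_k * delta_t_k
    return displacement_mm / bump_height_mm


# ASSUMED, illustrative values only (see the text above).
strain = cte_mismatch_shear_strain(
    cte_die_ppm=2.6,           # silicon, approximate
    cte_carrier_ppm=15.0,      # organic carrier, assumed
    delta_t_k=60.0,            # temperature swing per power cycle, assumed
    dist_from_centre_mm=10.0,  # corner bump on a large die, assumed
    bump_height_mm=0.07,       # 70 um bump, assumed
)
print(f"estimated shear strain per thermal cycle: {strain:.1%}")
```

In this simple model the strain grows linearly with distance from the die centre and with the temperature swing, so the outermost bumps on a large, hot die are the ones loaded hardest on every power cycle.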
 
The following users thanked this post: KL27x, jotwerde

Offline jotwerde (Topic starter)

  • Contributor
  • Posts: 18
  • Country: bg
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #2 on: August 29, 2018, 04:08:37 pm »
Quote from: Ian.M
"the ball that failed was supplying a critical power rail with tight sequencing requirements."

I don't really understand what you mean here, could you explain this sentence a little more? Thanks.
If I got the rest of your answer right, then what I misunderstood was that with 'dead GPU' he doesn't mean that the GPU has completely failed, but that there's no way to fix the inner solder cracks, right?
 

Offline ArthurDent

  • Super Contributor
  • ***
  • Posts: 1193
  • Country: us
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #3 on: August 29, 2018, 04:09:45 pm »
What Ian.M says makes sense to me, but his saying “It's probably not 'dead' dead…” reminds me of this Monty Python skit.



I admit that I know little or nothing about this problem, but it seems to me that if there are also solder bonds inside the chip, the alloy must have a melting point much higher than that of the solder used to attach the chip to the board; otherwise, any time you soldered a chip to a board, you’d stand a chance, however small, of messing up the connections on the innards of the chip. If the problem is between chip and board, could it be exacerbated by RoHS?

If the chip/board bond is broken, then I would think a chip removal, cleaning, new flux, and new solder balls would be needed; otherwise any apparent repair from just reheating might not last long. If the problem is internal, then a new chip is probably called for.
 
The following users thanked this post: jotwerde

Offline Gyro

  • Super Contributor
  • ***
  • Posts: 10172
  • Country: gb
« Last Edit: August 29, 2018, 04:27:48 pm by Gyro »
Best Regards, Chris
 
The following users thanked this post: jotwerde

Offline David Hess

  • Super Contributor
  • ***
  • Posts: 17427
  • Country: us
  • DavidH
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #5 on: August 29, 2018, 07:24:23 pm »
This problem crops up occasionally with hybrids and lead frame die attachments for ICs, where reflowing the solder may actually fix it, but flip-chip packaging is more fragile for the reasons Ian.M gives.
 
The following users thanked this post: jotwerde

Offline mikerj

  • Super Contributor
  • ***
  • Posts: 3382
  • Country: gb
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #6 on: August 29, 2018, 08:22:35 pm »
Quote from: jotwerde
Quote from: Ian.M
"the ball that failed was supplying a critical power rail with tight sequencing requirements."

I don't really understand what you mean here, could you explain this sentence a little more? Thanks.
If I got the rest of your answer right, then what I misunderstood was that with 'dead GPU' he doesn't mean that the GPU has completely failed, but that there's no way to fix the inner solder cracks, right?

Ian is saying the die itself is likely not damaged unless the broken solder ball caused damaging currents to flow.  GPUs tend to have tight power sequencing requirements for their various supply rails, so if one of these became disconnected due to a broken solder bump between the die and carrier, then it's possible that damaging internal currents could flow through parasitic diodes etc.  If the broken solder bump was carrying a logic level signal then chances are the die is still fine.
 
The following users thanked this post: Ian.M, jotwerde

Online Ian.M

  • Super Contributor
  • ***
  • Posts: 13216
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #7 on: August 29, 2018, 08:39:33 pm »
Yes.  Another permanent death scenario is loss of a control line causing bus contention, resulting in I/O driver burnout.  *IF* one had the bare die, the resulting failure would often be quite obvious under an electron microscope, with an irregular crater where there should be well-defined quasi-rectangular metallisation and SiO2 structures.

Hmmm ... I wonder if there is a market opportunity for an enterprising company with IC packaging and failure analysis experience to buy up, as cheap failed parts, high-value flip-chip-on-carrier BGAs that are unobtainium as N.O.S., separate the die from the carrier, auto-inspect for die damage, then re-bump, reflow and bond onto a new carrier, ball that, and test on a 'franken-card' that has had its key BGA replaced with a test socket, so the result can be resold with a 1 year parts-only warranty?
« Last Edit: August 29, 2018, 08:45:09 pm by Ian.M »
 
The following users thanked this post: KL27x, jotwerde

Offline jotwerde (Topic starter)

  • Contributor
  • Posts: 18
  • Country: bg
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #8 on: September 08, 2018, 07:37:15 am »
Quote from: mikerj
Ian is saying the die itself is likely not damaged unless the broken solder ball caused damaging currents to flow.  GPUs tend to have tight power sequencing requirements for their various supply rails, so if one of these became disconnected due to a broken solder bump between the die and carrier, then it's possible that damaging internal currents could flow through parasitic diodes etc.  If the broken solder bump was carrying a logic level signal then chances are the die is still fine.

How much of a difference is there usually between logic level signal currents and power currents?
 

Offline janoc

  • Super Contributor
  • ***
  • Posts: 3925
  • Country: de
Re: FCBGA: Why does heating the inner solder of a chip 'revives' it?
« Reply #9 on: September 08, 2018, 09:18:37 am »
Quote from: jotwerde
How much of a difference is there usually between logic level signal currents and power currents?

Like several orders of magnitude?

Logic inputs have high impedance, so at most you get short current spikes while the input capacitances charge and discharge, but the average currents are low, less than 100 mA for sure.

The power lines carry the power to the chip: if the card needs e.g. 100 W of power at 12 V, that is 100 W / 12 V ≈ 8.3 A. Of course, that is not going through a single line, and not all of that power goes to the GPU (some is also consumed by the RAM, voltage regulators, fans, etc.), but still, it gives you an idea of the difference in magnitude.
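
The same arithmetic as a small sketch, in case you want to plug in other numbers. The 100 W, 12 V and 100 mA figures are the examples from this post; the 1.0 V core rail is an illustrative assumption (the GPU core runs at a much lower voltage than the 12 V input, so the current on that rail is correspondingly higher), and the function names are made up for the example.

```python
import math


def rail_current(power_w: float, voltage_v: float) -> float:
    """Current in amperes drawn from a supply rail: I = P / V."""
    if voltage_v <= 0:
        raise ValueError("voltage must be positive")
    return power_w / voltage_v


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """How many powers of ten separate two positive quantities."""
    return math.log10(larger / smaller)


card_power_w = 100.0    # example card power from the post
input_voltage_v = 12.0  # example input voltage from the post
signal_bound_a = 0.1    # the post's "less than 100 mA" upper bound for logic signals
core_voltage_v = 1.0    # ASSUMPTION: illustrative GPU core rail voltage

input_current_a = rail_current(card_power_w, input_voltage_v)
# Upper bound: pretends all of the card's power ends up on the core rail.
core_current_a = rail_current(card_power_w, core_voltage_v)

print(f"12 V input current: {input_current_a:.1f} A")
print(f"core rail current (assumed 1.0 V): {core_current_a:.0f} A")
print(f"core rail vs. signal bound: "
      f"{orders_of_magnitude(core_current_a, signal_bound_a):.1f} orders of magnitude")
```

Since 100 mA is only an upper bound on the signal side, the real gap between a typical logic signal and a power rail is larger than what this prints.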
 
The following users thanked this post: jotwerde

