Well, a few things at once. The die by itself was never big for energy handling, and its part of the story is over in a hundred microseconds or so. By a few ms, heat is flowing into the copper tab, and it matters whether that's a weedy little SMT tab or a thick plate as in TO-220AB (not the minimal garden-variety TO-220, mind you!) or the like. (I don't know the dimensions, but I'm guessing this is at least in part why SOT-89 always has such feeble ratings, despite what you'd expect from its direct-mounted tab, in contrast to SOT-223's long gull-wing tab.) Then in the 10-100 ms range, heat flows out of that tab into the heatsink (if any), and in the 1-100 s range, heat spreads across the heatsink itself.
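To put rough numbers on those time scales, here's a minimal sketch of a Foster-type thermal network, Zth(t) = sum of R_i * (1 - exp(-t / tau_i)). The time constants loosely follow the ranges above; the resistances are made-up placeholders, not values from any datasheet, and the stage names are only nominal (Foster stages are a curve fit and don't map one-to-one onto physical layers).

```python
import math

# Hypothetical Foster-network stages: (label, R in K/W, tau in seconds).
# Time constants loosely follow the ranges in the post; resistances are
# placeholders chosen for illustration only.
STAGES = [
    ("die", 0.2, 100e-6),
    ("tab", 0.5, 3e-3),
    ("tab-to-heatsink", 1.0, 50e-3),
    ("heatsink", 3.0, 30.0),
]


def zth(t, stages=STAGES):
    """Transient thermal impedance (K/W) seen by a power step lasting t seconds."""
    return sum(r * (1.0 - math.exp(-t / tau)) for _, r, tau in stages)


def temp_rise(power_w, t, stages=STAGES):
    """Junction temperature rise (K) for a constant power step of duration t."""
    return power_w * zth(t, stages)


if __name__ == "__main__":
    # Sweep pulse widths from microseconds to minutes.
    for t in (10e-6, 100e-6, 1e-3, 10e-3, 100e-3, 1.0, 100.0):
        print(f"t = {t:9.2e} s   Zth = {zth(t):6.3f} K/W")
```

Swapping in a thinner tab would correspond to a different R and tau on the second stage, which is the knob that separates the small SMT packages from the thick-tab ones in this picture.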
A small SMT part like this will never have much energy or power capacity, regardless of the process node used for its chip; even DPAK is only a little bit better, since its tab is still fairly thin. D2PAK and power (THT) devices are where you need to go for energy handling at modest time scales.
Again, this is all filtered through Rds(on) and switching speed, which together set how much energy gets dissipated and over what time. So you might have a very small but low-Rds(on) part which, when switched quickly, seems more robust than a much beefier part; or which, when switched at an unfortunate enough (slowish) rate, performs about as poorly as you'd expect given its diminutive size. Without controlling for those variables, and with everything happening in well under the blink of an eye, it can seem random which parts are robust and which aren't.
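As a rough illustration of the speed dependence, here's a sketch using the usual triangle-overlap estimate for hard switching into a clamped inductive load, E = 0.5 * V * I * t_sw, next to plain I^2 * Rds(on) conduction loss. All the part values are hypothetical, and the overlap formula is a textbook approximation, not anything measured.

```python
def switching_energy(v_ds, i_d, t_sw):
    """Energy (J) per switching transition, assuming a linear V/I overlap of t_sw seconds."""
    return 0.5 * v_ds * i_d * t_sw


def conduction_power(i_d, rds_on):
    """Steady conduction loss (W) at drain current i_d through rds_on ohms."""
    return i_d ** 2 * rds_on


if __name__ == "__main__":
    v_ds, i_d = 50.0, 10.0  # hypothetical operating point: 50 V, 10 A

    # Hypothetical parts: a small part driven hard vs. a big part driven lazily.
    cases = {
        "small part, 20 ns edge": 20e-9,
        "big part, 2 us edge": 2e-6,
    }
    for name, t_sw in cases.items():
        e_uj = switching_energy(v_ds, i_d, t_sw) * 1e6
        print(f"{name}: {e_uj:.1f} uJ per transition")
```

With the same voltage and current, the energy per edge scales directly with transition time, so the slow case dumps a hundred times more into the die per event, and it does so inside the window where only the die's own heat capacity is available.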
Not to mention 2nd breakdown, which is hit or miss with modern devices. Some have quite severely limited SOA; others are shockingly wide despite their high power density.
Also, just to clarify, sub-200V or so transistors aren't doing anything *too* crazy, like SuperJunction stuff. They're just very highly optimized trench VDMOS or whatever. Or maybe there are specific techniques applicable there that I've not read about yet, not sure. SJ is coming to lower and lower voltage ratings, though: the thinner they can make the SJ pillars, the lower the voltage it's useful at. They started at 600V I think (also the most lucrative and important market segment to improve with it), and are now down to 200V or so. Sub-100V may be coming yet.
I wonder if they would ever use SJ on LDMOS (lateral RF) parts, or if it can be used there at all, heh. I've seen a couple of articles about charge-compensated lateral structures, so it seems possible. Maybe not as advantageous as it is for power switching. There shouldn't be any particular advantage with voltage, I think; you're still limited by drift through the channel (regardless of how it's doped), reducing performance above 100V or so.
Tim