None of them are very good.
TO-3 (TO-204, etc.) is usually a steel package, and not usually very flat. The all-metal design does typically allow slightly higher junction temperatures, though (I think I've seen 225°C before).
TO-247 is much easier to use: PCB mounted, single screw to heatsink, insulated hole (no freaking shoulder washers!). TO-3P is, I think, a Japanese creation, and is very similar.
TO-264 is much larger, but the same basic thing.
But none of them are exceptional, because you have the same fundamental problem: getting over 100W of heat out through a small footprint, very likely through an insulator. All insulators suck. Either the conductivity is poor (mica, Kapton, Sil-Pads, etc.), or the material is very thick (AlN, BeO and such), or the flatness sucks (TO-3 in particular, but extruded heatsinks are an offender too). And all the good ones need grease, which is still more assembly work.
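To put rough numbers on that, here is a minimal sketch of the usual series thermal-resistance estimate, Tj = Ta + P * (Rjc + Rcs + Rsa). The function name and every value in the example are illustrative assumptions, not figures from any datasheet; plug in your own.

```python
def junction_temp(power_w, ambient_c, r_jc, r_cs, r_sa):
    """Estimate junction temperature (deg C) from a series thermal stack.

    r_jc: junction-to-case, r_cs: case-to-sink (the insulator lives here),
    r_sa: sink-to-ambient. All thermal resistances in K/W.
    """
    return ambient_c + power_w * (r_jc + r_cs + r_sa)


# Assumed example values only (not from any datasheet):
# 100 W, 40 C ambient, 0.4 K/W junction-to-case,
# 0.5 K/W through the insulating pad, 0.3 K/W heatsink.
tj = junction_temp(100, 40, 0.4, 0.5, 0.3)
pad_rise = 100 * 0.5  # temperature drop across the insulator alone, in K
print(f"Tj ~ {tj:.0f} C, of which {pad_rise:.0f} K is lost in the insulator")
```

With those made-up values the junction lands around 160°C, and the pad accounts for 50 K of the rise on its own, which is the whole complaint in a nutshell.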
The choice of package really doesn't make much difference, because the insulators are all awful. Make your choice on which is easier to connect to, and to assemble. TO-247 with a Sil-Pad is hard to beat. TO-3 can't be put into a PCB, not in a serviceable manner, because its leads have to pass through the heatsink.
Really, the underlying truth, more illuminating than making tedious statements about crappy packages, is -- dissipating power with semiconductors is expensive and sucks.
In power dissipation applications (active load, battery conditioner?), you can offload a lot of dissipation into a stack of resistors, saving considerably on heatsinking requirements and power transistors and mechanical assembly. This is a lot harder to implement for amplifiers, so you're kind of stuck in that case (think of it as a "cost of doing business" to get a good, low noise, high speed amplifier). That's where the magic of class D amplifiers comes in; if you don't need the speed or the noise requirement (which is largely the case in slow, audio applications), they solve all of this.
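For the active-load case, the resistor offload can be sketched like this, assuming the simplest arrangement: a fixed source voltage, one resistor in series with the transistor, and the transistor setting the current. The function names and the 50 V / 4.5 ohm / 10 A numbers are assumptions picked for illustration.

```python
def transistor_power(v_supply, r_series, current):
    """Power left in the transistor when a series resistor takes part of the drop."""
    v_transistor = v_supply - current * r_series
    if v_transistor < 0:
        raise ValueError("resistor alone would drop more than the supply at this current")
    return v_transistor * current


def worst_case_transistor_power(v_supply, r_series):
    """Peak of (V - I*R)*I over current; it occurs at I = V/(2R) and equals V^2/(4R)."""
    return v_supply ** 2 / (4 * r_series)


# Assumed example: 50 V source, 10 A full load, 4.5 ohm series resistor
# (leaves 5 V of headroom across the transistor at full current).
v, r, i_max = 50.0, 4.5, 10.0
print("no resistor, full load:   ", transistor_power(v, 0.0, i_max), "W")
print("with resistor, full load: ", transistor_power(v, r, i_max), "W")
print("with resistor, worst case:", round(worst_case_transistor_power(v, r), 1), "W")
```

Note the worst case for the transistor moves to mid-range current rather than full load, so size the heatsink for that point, not for the full-load figure.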
BTW, TO-247 doesn't exhibit the flex that TO-220 is notorious for. It is still uneven, but not as bad, and smooth, flat, greased mating surfaces do fine. You can also secure them with clips (required for the unholey MAX247 variant!), which can be even easier for assembly (there are clip-on types that don't need a screw at all).
For more than 100W per package (or maybe 150W for TO-264), consider using multiple devices in parallel (mind the current sharing) or larger devices (ISOTOP / SOT-227 or various industrial modules). TO-220 and TO-247 are so damn cheap, they're hard to compete with, as long as you can connect them in parallel effectively; just keep piling them up.
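A quick way to count packages for the "keep piling them up" approach, as a rough sketch: divide total dissipation by the per-package limit, derated for imperfect current sharing. The `sharing_derate` factor and its 0.8 value are my own assumption for illustration, not a measured figure; the 100W limit is the rule of thumb above.

```python
import math


def devices_needed(total_power_w, per_package_w=100.0, sharing_derate=1.0):
    """Number of paralleled packages for a given total dissipation.

    sharing_derate (hypothetical knob, 0 < value <= 1) shrinks the usable
    per-package power to allow for unequal current sharing.
    """
    if not 0 < sharing_derate <= 1:
        raise ValueError("sharing_derate must be in (0, 1]")
    usable = per_package_w * sharing_derate
    return math.ceil(total_power_w / usable)


# Assumed example: 450 W total, 100 W per package.
print(devices_needed(450))                      # perfect sharing
print(devices_needed(450, sharing_derate=0.8))  # allowing for mismatch
```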
Tim