This leaves basically something like STM32F7 running at ~200MHz, where power dissipation isn't a problem yet, but the convenience from internal regulator is very real.
Actually if you look at a real product, say an enclosed case, mounted in some industrial DIN rail enclosure, Tamb perhaps 45C, and some internal dissipation, you pretty easily get to the max Tj, with a 32F417. The chip is easily doing 15C above the PCB temp, even doing almost nothing; 168MHz. I am running FreeRTOS with a proper wait on idle tasks.
https://www.eevblog.com/forum/microcontrollers/freertos-where-should-the-cpu-be-spending-most-of-its-time/msg4970857/#msg4970857
I can see +55C Tj even with the PCB lying open on the bench. Probably doing a lot of work, but...
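To put rough numbers on the enclosed-case scenario, here is a minimal sketch that just stacks the temperature rises on top of ambient. The 45C ambient and the 15C chip-over-PCB rise are from the post above; the enclosure rise and the Tj limit are placeholders I made up for illustration, so take the limit from the datasheet for your exact part and temperature suffix. The function names are hypothetical.

```c
#include <stdio.h>

/* Rough junction temperature estimate: ambient plus each rise stacked on top.
 * t_amb         : air temperature outside the enclosure, in C
 * enclosure_rise: how much hotter the PCB runs than t_amb (ASSUMED value, measure it)
 * chip_over_pcb : how much hotter the die runs than the PCB (about 15 C in the post)
 */
static double tj_estimate(double t_amb, double enclosure_rise, double chip_over_pcb)
{
    return t_amb + enclosure_rise + chip_over_pcb;
}

/* Remaining margin to the junction limit. tj_max is an ASSUMPTION supplied by
 * the caller; read it from the datasheet rather than trusting an example value. */
static double tj_headroom(double tj_max, double tj)
{
    return tj_max - tj;
}

/* Print one line summarising the budget. */
static void print_budget(double t_amb, double enclosure_rise,
                         double chip_over_pcb, double tj_max)
{
    double tj = tj_estimate(t_amb, enclosure_rise, chip_over_pcb);
    printf("Tj ~ %.0f C, headroom %.0f C\n", tj, tj_headroom(tj_max, tj));
}
```

Calling print_budget(45.0, 30.0, 15.0, 105.0), with the 30C and 105C figures being the made-up placeholders, shows how quickly the margin gets eaten once the box is closed.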
using a heatsink on something like a STM32F4 MCU makes much sense, sure you'll see a non-negligible difference
A minuscule heatsink like the 1.5cm x 1.5cm x 1cm H one in my above pic, yeah, that is useless. Especially as they are mostly (in Chinese routers etc) glued on with some double-sided adhesive pad.

But with a sensible design you could knock 20C off Tj with something bigger. If Tj is say +100C and Tamb is say +50C then the thermal gradient will be pretty nicely in your favour. I could stick a piece of aluminium between the CPU and the metal can of the nearby RJ45 though
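As a sanity check on the "knock 20C off" figure, here is a rough sketch using the simple lumped model Tj = Tamb + P * theta, assuming the dissipated power stays the same with and without the heatsink. Under that assumption the power cancels out, and what matters is the ratio of the new junction-to-ambient thermal resistance to the old one. The function name is hypothetical.

```c
/* Fraction of the original junction-to-ambient thermal resistance that the
 * heatsinked assembly may have in order to reach tj_target.
 * Assumes Tj = Tamb + P * theta with P unchanged, so P cancels:
 *   theta_new / theta_old = (tj_target - t_amb) / (tj_now - t_amb)
 * All temperatures in C; tj_now must be above t_amb.
 */
static double theta_ratio_needed(double tj_now, double tj_target, double t_amb)
{
    return (tj_target - t_amb) / (tj_now - t_amb);
}
```

With the numbers above (Tj +100C, Tamb +50C, target +80C) this comes out at 0.6, i.e. the heatsink plus its interface has to cut the total thermal resistance by about 40%. That is why a tiny heatsink on a thick adhesive pad does so little.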

It's a lesson for me, seeing these 1.8V LDO regs costing practically nothing. I don't have room for even a SOT-223 one on my current board... well not without spending a lot of time.

However, a quick read of the DS suggests it is not completely trivial. One needs an external reset controller, to detect the +3.3V falling to say 3.0V. Some issues with PA0 also, which is no longer usable for GPIO.