Yes, it does have an overhead, however you look at it.
But the main point is elsewhere: as I think we already discussed in earlier threads, CMOS processes that offer embedded Flash are fewer and more expensive, so the cost of dies is significantly higher whether you actually embed Flash or not, and even if you only use a small area of it. OTP, by contrast, is usually available on a wider range of CMOS processes at a lower price point.
Most modern processes are only set up for flash.
Nope. Just have a look at the offerings from TSMC, GlobalFoundries and whoever else you like.
Name some actual processes that are not flash, or flash without a charge pump acting like some kind of fake EPROM, and that aren't ancient.
I don't know the answer to that.
I do know, from my own experience, that the SiFive FE310 (TSMC 180nm) and FU540 (TSMC 28nm) both provide a few KB (8 and 16 respectively) of OTP, and no flash. They use external SPI flash for firmware.
Incidentally, the code in OTP (as delivered from the factory on the HiFive1 and HiFive Unleashed boards) starts with a RISC-V FENCE instruction, 0x0000000F, which can easily be burned later into a relative jump of up to ±1 MB, 0xnnnnn06F. That works because going from the FENCE opcode 0x0F to the JAL opcode 0x6F, and filling in the offset bits, only ever sets more bits.
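To make the trick concrete, here is a rough Python sketch of the bit arithmetic. It assumes the OTP can only change bits from 0 to 1, which is what the post implies but which I haven't checked against the datasheets. The helper names `encode_jal_x0` and `can_burn` are made up for illustration. The JAL field layout is the standard J-type encoding from the RISC-V base ISA.

```python
FENCE_PLACEHOLDER = 0x0000000F  # word at the start of OTP, as described above
JAL_OPCODE = 0x6F               # JAL major opcode; rd = x0 leaves bits 11:7 at 0


def encode_jal_x0(offset: int) -> int:
    """Encode `jal x0, offset`, i.e. a plain PC-relative jump (hypothetical helper)."""
    if offset % 2 != 0:
        raise ValueError("offset must be a multiple of 2")
    if not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset out of the +/-1 MiB JAL range")
    imm = offset & 0x1FFFFF  # 21-bit two's complement
    # J-type immediate is scattered as imm[20|10:1|11|19:12] over bits 31:12.
    return (
        (((imm >> 20) & 0x1) << 31)
        | (((imm >> 1) & 0x3FF) << 21)
        | (((imm >> 11) & 0x1) << 20)
        | (((imm >> 12) & 0xFF) << 12)
        | JAL_OPCODE
    )


def can_burn(old: int, new: int) -> bool:
    """True if `new` is reachable from `old` by only setting bits.

    Assumption: OTP bits can go 0 -> 1 but never back.
    """
    return (old & new) == old


if __name__ == "__main__":
    for off in (0x1000, -4, 0xFFFFE, -0x100000):
        word = encode_jal_x0(off)
        print(f"jal x0, {off:#x} -> {word:#010x} burnable: "
              f"{can_burn(FENCE_PLACEHOLDER, word)}")
```

Every JAL with rd = x0 has the low byte 0x6F, which contains all four set bits of 0x0F, so the check passes for any legal offset.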
I'd be interested to know the similar tricks that work on other ISAs.
Having all 0s be a NOP would be the simplest thing, but the RISC-V designers deliberately made both all 0s and all 1s be illegal instructions to try to detect execution off into the weeds ASAP.