I think the statement was "ignored" because, objectively, it's not true.
Most of the idiosyncrasies I ran into when switching to an ARM MCU had more to do with the CPU core itself than with the vendor's peripherals.
For example, having to enable "power" or "clocks" to each peripheral before using it. Or the fact that an 8-bit CPU nicely matches the smallest addressable data size in C, making it easy to program, while a 32-bit CPU introduces gotchas such as unaligned memory access and whatnot. Likewise, debugging HardFaults is another one of those things.
The longest bug I had to chase down took 1.5 weeks to find. It was on an LPC1768, but it had nothing to do with NXP's work: it was a bit we forgot to manipulate within the ARM CPU core itself that led to stack problems.
etc.
And even where there are idiosyncrasies with peripherals, in my experience they mostly have to do with chip scaffolding (buses, memory, clocks, power, DMA) rather than with individual peripherals. It's not like setting up a UART or SPI controller on an STM32, LPC or PIC makes you jump through half a dozen hoops: a single pass through the peripheral registers is typically enough to make it work. ST reuses its peripherals among chips and may add a feature here or there, but typically "I don't care about this feature" => "keep the bit at zero" works, just like on a PIC, AVR or any other sanely designed MCU.
Now that doesn't mean there are no issues with the peripherals of a particular vendor. STM32s, for example, (used to) have a frustrating DMA controller: each peripheral request could only be mapped to one particular DMA channel, which could lead to conflicts. This "firmware driver work" is in actuality hardware design, similar to how you would allocate peripherals depending on, e.g., I/O pad locations. Similarly, some internal signals only have a handful of routing options on STM32 where a more generic solution would be needed.
Other vendors like SiLabs and Atmel have tried to solve this with PRS and the like, which are easy to use; however, for some of these chips the silicon itself may not tick all the boxes. E.g. if you need an industrial peripheral set (many CAN buses), then you'll probably skip SiLabs' offerings. If you need ultra low power, then you probably won't find many suitable options from NXP.
There just isn't a way to "have it all".
All these "easy to configure" aspects are similar across STM32s, LPCs, SiLabs, etc. I've yet to see any vendor do things drastically differently.
And the question is whether that's even possible or desirable. As soon as a vendor calls something "smart", I translate that into non-transparent, frustrating, and a can of worms. Similar to how a software ecosystem is design lock-in, or a "prison".
The reason I think PICs and AVRs have none of these adoption issues is that those chips contain very, very little hardware. There is only so much that can go wrong with a handful of registers for a UART! And also only so many things to use it for...
Regarding documentation: I think all vendors try their best, but fall short one way or another. They cannot possibly list all possible configurations of their chips. I've been looking at an (admittedly old: rev 2) user manual of the LPC1768, and I'm not that impressed. The DMA controller chapter is only 29 pages long, 22 of which are register documentation (auto-generated tables with some bitfield annotations). The chapter says the DMA controller supports scatter-gather using linked lists, but then doesn't mention it once in the remaining 7 pages. It's basically undocumented, as the register listing doesn't specify how to use it either. Probably better to mail their sales/app team about how it works... but that requires human interaction. Eek.
STM32 documents can sit at the opposite extreme: an enormous amount of redundant blabla, which then (forgets to) reference content from half a dozen other places in a 2000+ page document. But at this point I also get the impression that using those chips bare metal is the non-mainstream way: many people use ST's HAL and CubeMX software to generate driver initialization code.
Which one is better? Again: can't have it all, pick your poison.
My point is: in the end these 32-bit ARM chips have similar complexity; some chips have more quality-of-life features in the silicon (such as remappable I/O) and some in the datasheet. Which one you like to use more is subjective, and that's fine. This statement, however, has been repeatedly posted by this user. Subjectively speaking, such a statement cannot be false, but I don't think it's true either.