Yeah, "just bypass the HAL" is a common refrain! I've stayed the HAL course for a couple of decades and have too much proven-in-use collateral to abandon ship now, particularly given the extraordinary complexity of peripherals these days and the desire to lean on vendors to keep resistance to change low.
Yes, errors are cleared prior to initialisation. The HAL actually does a pretty good job of this, ensuring you don't get interrupts on initialisation. It's the period between initialisation and starting a receive that is no man's land. By then interrupts are already armed to fire, so it's too late to clear anything.
Yes, fortunately that UART-before-DMA initialisation bug has been fixed for a while. What I'm seeing now, though, seems to be a deliberate ordering of things, which is why I'm wondering if I've misinterpreted the interface.
FWIW, in the meantime I'm proceeding with the solution I hinted at: in CubeMX I configure all UARTs as Transmit Only. Then, immediately after starting a receive for the first time, I manually flip the "receiver enable" bit. Feels a bit crude, but so far so good. It does what you would hope - the peripheral is configured with all the complex intra-peripheral settings you set up in CubeMX, and the Transmit Only setting doesn't seem to inhibit my options. The difference is that no interrupts fire until after you make your first HAL_UART_Receive* call.
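In case it helps anyone, here is a rough sketch of that workaround rather than my exact code. The helper name uart_start_first_receive is made up for illustration. The typedefs, the stubbed HAL_UART_Receive_IT and the USART_CR1_RE bit position (bit 2) are stand-ins I've assumed so the snippet compiles off-target. On real hardware you would delete that section, include your device's HAL header instead, and check the bit against your reference manual.

```c
#include <stdint.h>

/* --- Host-side stand-ins (assumptions, for illustration only). ---
 * On target, remove this section and include the device HAL header,
 * which provides the real types, bit definitions and driver function. */
typedef struct { volatile uint32_t CR1; } USART_TypeDef;
typedef struct { USART_TypeDef *Instance; } UART_HandleTypeDef;
typedef enum { HAL_OK = 0, HAL_ERROR, HAL_BUSY, HAL_TIMEOUT } HAL_StatusTypeDef;
#define USART_CR1_RE (1UL << 2) /* assumed bit position; verify per device */

/* Stub standing in for the real HAL_UART_Receive_IT so this compiles on a PC. */
static HAL_StatusTypeDef HAL_UART_Receive_IT(UART_HandleTypeDef *huart,
                                             uint8_t *pData, uint16_t Size)
{
    (void)huart; (void)pData; (void)Size;
    return HAL_OK;
}
/* --- End of stand-ins. --- */

/* Hypothetical helper: the UART was initialised as Transmit Only, so the
 * receiver is still disabled here. Start the receive first so the HAL's
 * receive state is set up, then enable the receiver by hand. */
static HAL_StatusTypeDef uart_start_first_receive(UART_HandleTypeDef *huart,
                                                  uint8_t *buf, uint16_t len)
{
    HAL_StatusTypeDef status = HAL_UART_Receive_IT(huart, buf, len);
    if (status == HAL_OK) {
        /* Set only the receiver-enable bit; the other CR1 bits are kept. */
        huart->Instance->CR1 |= USART_CR1_RE;
    }
    return status;
}
```

The read-modify-write on CR1 is not atomic, so if a UART interrupt could touch CR1 at the same moment you may want to guard it with a critical section. The same idea should apply to the DMA variant of the receive call, though I have only sketched the interrupt one here.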
PS. Maybe not what you meant, but note the HAL is not a black box. The only reason I've stuck with it for so long is that it's human-readable source code! You can even step through it in the debugger, and I do so regularly. That's essential, and I've abandoned other vendors that have gone the black-box route. Now, is the intent a black box? Well, technical support... varies.