For what it is worth, I blundered into a very interesting yet cheap way to fix the problems at the hardware level.
The WCH CH32V203C8T6 is a Qingke V4B RISC-V MCU in LQFP48 with two full-speed USB interfaces, yet it costs less than 1 USD/EUR in singles at LCSC (and Aliexpress). It does not require a crystal; a 3.3V regulator and two 100nF X7R supply bypass capacitors suffice. It also has four UARTs, two SPI, and two I2C. See Stefan Wagner's CH32V203 F6P6 development board as an example, one using the F6P6/TSSOP20 variant.
On the C8T6, one can use both USBs, I2C1 (SCL on 45/PB8 and SDA on 46/PB9, alternate configuration), USART2 and USART4 (RX, TX, CTS, RTS), SPI1 (NSS, SCK, MOSI, MISO), and either SPI2 (NSS, SCK, MOSI, MISO) and I2C2 (SDA, SCL), or USART3 (RX, TX, CTS, RTS), at the same time.
The idea is simple: you have a small device exposing the aforementioned pins. One USB device (USBD) is the bulk data interface. The other USB device (USBFS) is optional, and provides a terminal console (using standard ANSI escape codes) where one can interactively configure and monitor the bulk data interface, for example using Minicom or PuTTY. You only need the console when you wish to configure the bulk interface, but the configuration could for example lock certain things, say the baud rate, to a fixed value, having the bulk interface lie to the OS. This is useful when you like a specific terminal program, but it has issues setting all the settings the way you like, or the driver does things you don't like. I'd add an I2C EEPROM like M24C01-RMN6P or M24C08-RMN6TP for about ten cents, to store the configuration without wearing down the Flash on the MCU. In all, I calculate the BOM cost to be less than 2 USD/EUR.
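To make the "lie to the OS" part concrete, here is a rough sketch of the logic in Python rather than firmware C. The 7-byte layout is the standard CDC ACM line coding structure (dwDTERate, bCharFormat, bParityType, bDataBits); the function names, the `overrides` dictionary, and the policy of letting console settings win are my own assumptions for illustration, not anything the chip or its SDK provides.

```python
import struct

# CDC ACM line coding as sent by the host in SET_LINE_CODING (7 bytes):
# dwDTERate (uint32, little-endian), bCharFormat, bParityType, bDataBits.
LINE_CODING = struct.Struct("<IBBB")

def effective_line_coding(requested: bytes, overrides: dict) -> dict:
    """Return the settings the UART would actually be programmed with."""
    baud, stop, parity, bits = LINE_CODING.unpack(requested)
    settings = {"baud": baud, "stop": stop, "parity": parity, "bits": bits}
    # Hypothetical policy: anything locked via the console wins over the host.
    settings.update(overrides)
    return settings

def reported_line_coding(requested: bytes) -> bytes:
    """Answer GET_LINE_CODING with the host's own request (the 'lie')."""
    return requested

host_request = LINE_CODING.pack(9600, 0, 0, 8)   # host asks for 9600 8N1
locked = {"baud": 115200}                        # set earlier via the console
print(effective_line_coding(host_request, locked))
```

The host keeps believing it is talking at 9600 baud, while the UART pins run at whatever was locked in the console.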
In Linux of course, one can do the same thing just using different CDC ACM endpoints and udev rules to identify them. Here, the two USB devices can have different Vendor:Product and/or Serial, identifying which is which on all operating systems, even Windows, when still using the generic/class CDC ACM drivers.
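On the host side, telling the two apart then boils down to a lookup by Vendor:Product. A minimal sketch, assuming placeholder IDs (the `ROLES` table, the `Port` tuple, and the example device nodes are all made up for illustration):

```python
from collections import namedtuple

# Placeholder Vendor:Product pairs; real firmware would use its own IDs.
ROLES = {
    (0x1209, 0x0001): "bulk",
    (0x1209, 0x0002): "console",
}

Port = namedtuple("Port", "device vid pid serial_number")

def find_roles(ports, roles=ROLES):
    """Map role name -> device node for every port whose VID:PID is known."""
    found = {}
    for port in ports:
        role = roles.get((port.vid, port.pid))
        if role is not None:
            found[role] = port.device
    return found

ports = [
    Port("/dev/ttyACM0", 0x1209, 0x0002, "A1"),
    Port("/dev/ttyACM1", 0x1209, 0x0001, "A1"),
    Port("/dev/ttyUSB0", 0x0403, 0x6001, None),   # unrelated adapter, ignored
]
print(find_roles(ports))
```

If you use pyserial, the objects returned by `serial.tools.list_ports.comports()` carry the same `device`, `vid`, `pid` and `serial_number` attributes, so they can be passed to `find_roles()` as-is, and the same code works with COM port names on Windows.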
Two separate USB devices also make for interesting shenanigans, since each USB is limited to about 1 Mbyte/s of data throughput. For example, for custom bulk SPI transfers, one could use one USB for outgoing data and the other for incoming data, for sustained continuous SPI transfers at about 8 MHz (1 Mbyte/s in each direction), effectively doubling the full-speed USB bandwidth. Also, resetting/reconfiguring the bulk data interface for a different set of endpoints won't reset the configuration interface, because they are separate USB devices.
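The back-of-the-envelope arithmetic behind the 8 MHz figure, as a quick sketch. The 19 packets per frame is the theoretical upper bound the USB 2.0 specification gives for 64-byte full-speed bulk packets; real-world throughput is lower (hence "about 1 Mbyte/s"), so treat the result as a ceiling, not a promise. The function names are just mine.

```python
FRAMES_PER_SECOND = 1000      # full-speed USB: one frame per millisecond
BULK_PACKET_BYTES = 64        # maximum full-speed bulk packet size
MAX_PACKETS_PER_FRAME = 19    # theoretical bound for 64-byte bulk packets

def usb_bulk_ceiling():
    """Theoretical full-speed bulk payload in bytes per second, per bus."""
    return FRAMES_PER_SECOND * BULK_PACKET_BYTES * MAX_PACKETS_PER_FRAME

def spi_bytes_per_second(clock_hz):
    """SPI moves one bit per clock in each direction (MOSI and MISO)."""
    return clock_hz // 8

spi = spi_bytes_per_second(8_000_000)
usb = usb_bulk_ceiling()
print(spi, usb)
print("fits on one USB, both directions:", 2 * spi <= usb)
print("fits with one USB per direction: ", spi <= usb)
```

Full-duplex SPI at 8 MHz needs 1 Mbyte/s out plus 1 Mbyte/s in, which a single full-speed bus cannot carry since both directions share it, but one bus per direction just about can.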