I like the Tang 20K - I have a few of them - they’re great value in relative terms, and come with DDR3, not to mention a faster fabric (the pico-rv32 gets 42MHz on the 20K) and of course there’s a lot more space. The sticking point there is the price: by the time I’ve added the DIMM connector and an HDMI port, we’re at 2x that of the 9K. Relatively speaking, that’s still great, but it’s still 2x the price.
It may come down to using the 20K, but I want to see if it can be squeezed into the 9K first. I don’t think there’s any chance of getting the logic I want inside the 4K.
As for what I’m doing, if you look at my comment here you’ll get an idea. I have a small (~40ns) window of time to read a bus, do a lookup, and assert 1 or 2 signals. Alongside that, I want the data on the bus decoded and parsed so that embedded video fetches can be extracted and used to construct an HDMI framebuffer, I want an ethernet interface, and I want a chunk of RAM (8MB is fine) to be available to the host. There are other ancillary tasks, but those are the main things.
I did consider doing it with an MCU, but the requirements are extreme. I think it could be done with an STM32H7{A|B}, running at 550MHz, with interrupt routines in ITCM for the bus monitoring and the LTDC to an ADV7511 to produce the display, quad-SPI or HyperRAM for the memory, and the built-in ethernet controller (though that would limit me to 16-bit RGB)… but here we run into the price wall again.
The rapid response needed to meet the hard timing requirements of the bus snooping, coupled with the relatively high bandwidth needed for video and megabytes of memory, is a tough nut to crack when price is a factor. You either run the bus snooping on interrupts (which means interrupt response fast enough to hit that 40ns window) and run the rest of the code “normally”, or you get into time-management hell with everything that can happen being slaved to the 558ns-period external clock.
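To put rough numbers on that, here is a back-of-envelope sketch in Python of the cycle budget, using the 40ns window, the 558ns bus period and the 550MHz core clock from above. The 12-cycle interrupt entry figure is my assumption (the nominal Cortex-M7 latency with zero-wait-state memory), not something I have measured, and the names are purely illustrative.

```python
# Back-of-envelope timing budget for the bus-snooping window.
WINDOW_NS = 40.0        # time available to read the bus, look up, and respond
BUS_PERIOD_NS = 558.0   # period of the external bus clock
MCU_HZ = 550e6          # MCU core clock considered above

# ASSUMPTION: nominal Cortex-M7 interrupt entry latency, zero-wait-state memory.
ASSUMED_IRQ_ENTRY_CYCLES = 12


def cycles_in(window_ns: float, clock_hz: float) -> float:
    """Number of clock cycles that fit into a window of window_ns nanoseconds."""
    return window_ns * 1e-9 * clock_hz


window_cycles = cycles_in(WINDOW_NS, MCU_HZ)        # cycles inside the 40ns window
period_cycles = cycles_in(BUS_PERIOD_NS, MCU_HZ)    # cycles per external bus clock
bus_clock_mhz = 1e3 / BUS_PERIOD_NS                 # external clock frequency in MHz

# Cycles left for the actual read + lookup + assert once the ISR has been entered.
cycles_left = window_cycles - ASSUMED_IRQ_ENTRY_CYCLES

print(f"window: {window_cycles:.1f} cycles")
print(f"bus period: {period_cycles:.1f} cycles ({bus_clock_mhz:.2f} MHz)")
print(f"left after assumed IRQ entry: {cycles_left:.1f} cycles")
```

So at 550MHz the window is only about 22 core cycles, and if the interrupt entry assumption holds, roughly half of that is gone before the first instruction of the handler runs.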
As for just using a hub, I don’t think that will work, will it? AFAIK, hubs expect everything on the “down” links to be devices (or more hubs) and the “up” link to point (maybe eventually) to a host. In this case there are two connections (RP2040, FPGA) which are normally devices, so they could both go behind a hub. But if the RP2040 turns into a host so it can connect to the FPGA and program it over JTAG, it can’t sit on the device side of the hub and expect that to work, at least as I understand it. The point of the switches (one for the FPGA, one for the MCU) is to allow that path to work and provide the path from FPGA-as-device to RP2040-as-host.
[edit: having said all that, it does occur to me that the cost of all these USB switches, cables, connectors and hub might in fact be a significant chunk of the price difference between the 9K and the 20K Tangs. There are a lot more I/O pins on the 20K as well, so no games to play with time-multiplexing the bus snooping. I’ll have to run the figures and see if it’s just worth switching up]