Sniffing the host-hub traffic with an LS device attached would be even nicer.
Yeah, that would be the use case. I assumed not a lot of people would be doing host/hub implementations, especially ones capable of multiple speeds.
As far as I understand now, the PIO supports small programs per pin for fast responsiveness.
They are not really per pin. A subset of pins is available to each PIO, but the ways the programs can use them are pretty limited.
the PIO0 program seems to do NRZI decoding
PIO0 does most of the work. It does not actually undo the NRZI encoding, but it handles re-synchronization to bit edges. This is likely unnecessary in practice, but I already had the code, so I left it there.
and the PIO1 program seems to detect EOP's.
PIO1 was a later addition. It is an attempt to synchronize the capture to the stream. Otherwise, if you start the capture in the middle of a packet, the whole thing will get messed up. PIO0 can only track its state if it started from the idle line state.
PIO1 only runs once at the beginning, looking for any EOP. The line is assumed to be in the idle state after that, so it signals PIO0 to start running.
The buffer seems to get filled with raw capture data in lines 581-609. Is that understanding correct?
Yes, this is the main loop that pulls the data from the PIO.
Not sure I currently understand the format of the data capture buffer, either in raw format or processed format. Is that documented somewhere?
It is just raw bits, separated by the size word. The PIO only captures 31 bits at a time, so the MSB (bit 31) remains 0 for normal data. The size (in bits) is tracked by a counter that starts at 0xffffffff and is decremented by 1 for each bit. At the end of the packet that value is also pushed into the stream, and it will always have the MSB set to 1.
Now you can scan the buffer and find the first word with the MSB set, and you know the exact number of bits in the preceding data stream.
The parsing happens in display.c.
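To make the buffer layout a bit more concrete, here is a minimal sketch of that scan. It is not the code from display.c. The names `find_size_word` and `size_word_to_bits` are made up for illustration, and it assumes the size counter starts at exactly 0xffffffff and drops by one per captured bit, so the bit count is recovered as 0xffffffff minus the size word. Check the actual firmware for the exact offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A word with the MSB set is a size word; data words carry 31 bits
 * and always have the MSB clear. */
#define SIZE_WORD_FLAG 0x80000000u

/* Assumption: the counter starts at 0xffffffff and is decremented once
 * per captured bit, so the number of bits is the distance from the top. */
static uint32_t size_word_to_bits(uint32_t word)
{
    return 0xffffffffu - word;
}

/* Return the index of the first size word at or after `start`,
 * or `len` if the buffer holds no further size word. */
static size_t find_size_word(const uint32_t *buf, size_t len, size_t start)
{
    assert(buf != NULL || len == 0);

    for (size_t i = start; i < len; i++) {
        if (buf[i] & SIZE_WORD_FLAG)
            return i;
    }
    return len;
}
```

A caller would walk the buffer packet by packet: the words between the previous size word and the one returned by `find_size_word` are the raw data for that packet, and `size_word_to_bits` says how many of those bits are valid.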