We've got a board with a Xilinx Zynq SoC and a USB3320 transceiver. I'm not posting this in the FPGA forum because, although the Zynq contains FPGA fabric, the USB peripheral is part of the hard ARM side of the device, so this looks more like a general-purpose embedded bug.
Our board is a custom hardware platform running Linux 5.10 (PetaLinux) on a Zynq 7010 with 256MB DDR3. Boot is via SD card.
In about 1 in 1000 boot cycles, the USB peripheral fails to initialise. The kernel log shows "udc-core: couldn't find an available UDC or it's busy" when our script attempts to initialise the USB peripheral. Only a hard power cycle fixes the issue.
With extra debug turned on, I noticed that every failure like this is accompanied by these kernel log lines:
ULPI transceiver vendor/product ID 0x0424/0x0007
Found SMSC USB3320 ULPI transceiver.
ULPI integrity check: failed!
A successful boot, instead, has "ULPI integrity check: passed."
The origin of this message is here:
https://github.com/Xilinx/linux-xlnx/blob/master/drivers/usb/phy/phy-ulpi.c (in ulpi_check_integrity).
One thing that struck me is that all the routine does is two single-byte write/read-back checks, and it doesn't retry anything. If there's an SEU (single-event upset), the check fails and the device won't be enumerated. But if our interface fails something like 1 in 1000 writes, perhaps we're not in SEU territory but in unreliable-bus territory?
I did notice that when I manually write to the ULPI interface using some low-level C code, the failure rate increases with SDIO activity on our SD card. That fits the symptom, since boot involves a lot of SDIO read activity. However, looking at the board layout, the ULPI and SD card traces are quite far from one another, and the supply voltage to the devices appears stable throughout, so the physical cause of the correlation is not obvious to me.
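For anyone wanting to reproduce the manual test, here is a minimal sketch of how such a stress loop could be structured so that it records which data bits flip, not just how often a mismatch occurs. This is not our actual test code. The `scratch_xfer_fn` callback, the `fake_link` simulator and the pattern set are all hypothetical; on real hardware the callback would write the value to the transceiver's scratch register and return what reads back.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical accessor: write val to the scratch register, return the read-back. */
typedef uint8_t (*scratch_xfer_fn)(void *ctx, uint8_t val);

struct ulpi_stress_stats {
    unsigned long iterations;
    unsigned long mismatches;
    unsigned long bit_flips[8];   /* mismatch count per data bit */
};

/* Run the write/read-back test and histogram the bits that differ. */
static void ulpi_stress(scratch_xfer_fn xfer, void *ctx,
                        unsigned long iterations,
                        struct ulpi_stress_stats *stats)
{
    /* Assumed test patterns; alternating bits plus all-zeros and all-ones. */
    static const uint8_t patterns[] = { 0x55, 0xaa, 0x00, 0xff };
    const size_t npatterns = sizeof(patterns) / sizeof(patterns[0]);

    memset(stats, 0, sizeof(*stats));
    for (unsigned long i = 0; i < iterations; i++) {
        uint8_t want = patterns[i % npatterns];
        uint8_t diff = (uint8_t)(want ^ xfer(ctx, want));

        stats->iterations++;
        if (!diff)
            continue;
        stats->mismatches++;
        for (int bit = 0; bit < 8; bit++)
            if (diff & (1u << bit))
                stats->bit_flips[bit]++;
    }
}

/* Simulated link for trying the harness without hardware (hypothetical). */
struct fake_link {
    unsigned long count;       /* transfers seen so far */
    unsigned long flip_every;  /* corrupt every Nth transfer, 0 = never */
    uint8_t flip_mask;         /* bits to invert on a corrupted transfer */
};

static uint8_t fake_xfer(void *ctx, uint8_t val)
{
    struct fake_link *link = ctx;

    link->count++;
    if (link->flip_every && link->count % link->flip_every == 0)
        return (uint8_t)(val ^ link->flip_mask);
    return val;
}
```

Running this once with the SD card idle and once under heavy SDIO reads would give two histograms to compare. Errors concentrated on one bit would point at a specific data line, while errors spread across all bits would suggest something common to the whole bus such as clock, reference or supply noise.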
I was considering getting our kernel engineer to patch this function so that it retries the check a few times before giving up. In normal operation we don't see any particular issues with USB, and it can run for hours on end without fault.
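To make the retry idea concrete, here is a rough userspace sketch of the logic, not the actual kernel patch. The `ulpi_bus` struct, the `fake_phy` simulator and the attempt count are hypothetical stand-ins for the kernel's PHY I/O helpers; the 0x55/0xaa patterns and the scratch register address (0x16) are my reading of the existing check and the ULPI register map.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ULPI_SCRATCH 0x16   /* ULPI scratch register address */

/* Hypothetical bus accessors standing in for the kernel's PHY I/O ops. */
struct ulpi_bus {
    int (*write)(struct ulpi_bus *bus, uint8_t reg, uint8_t val);
    int (*read)(struct ulpi_bus *bus, uint8_t reg); /* value or negative errno */
    void *priv;
};

/* One pass of the check: write each pattern to scratch and read it back. */
static int ulpi_check_once(struct ulpi_bus *bus)
{
    static const uint8_t patterns[] = { 0x55, 0xaa };

    for (size_t i = 0; i < sizeof(patterns); i++) {
        int ret = bus->write(bus, ULPI_SCRATCH, patterns[i]);
        if (ret < 0)
            return ret;
        ret = bus->read(bus, ULPI_SCRATCH);
        if (ret < 0)
            return ret;
        if (ret != patterns[i])
            return -ENODEV;
    }
    return 0;
}

/* Retry the whole check up to max_attempts times; report attempts used. */
static int ulpi_check_with_retry(struct ulpi_bus *bus, int max_attempts,
                                 int *attempts_used)
{
    int ret = -ENODEV;

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        ret = ulpi_check_once(bus);
        if (attempts_used)
            *attempts_used = attempt;
        if (ret == 0)
            return 0;
        /* A kernel version might also sleep briefly here before retrying. */
        fprintf(stderr, "ULPI integrity check attempt %d/%d failed (%d)\n",
                attempt, max_attempts, ret);
    }
    return ret;
}

/* Simulated PHY that corrupts the first N reads (hypothetical, for testing). */
struct fake_phy {
    uint8_t scratch;
    int reads_to_corrupt;
};

static int fake_write(struct ulpi_bus *bus, uint8_t reg, uint8_t val)
{
    struct fake_phy *phy = bus->priv;

    (void)reg;
    phy->scratch = val;
    return 0;
}

static int fake_read(struct ulpi_bus *bus, uint8_t reg)
{
    struct fake_phy *phy = bus->priv;

    (void)reg;
    if (phy->reads_to_corrupt > 0) {
        phy->reads_to_corrupt--;
        return phy->scratch ^ 0x01;   /* flip bit 0 to simulate a bus error */
    }
    return phy->scratch;
}
```

Logging every failed attempt matters here: a retry would hide the symptom at boot, so the log is the only way to keep track of how often the bus actually misbehaves. It would not explain why the bus is marginal in the first place.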
Curious what others think is going on here. I haven't probed the ULPI bus yet; that will be the next task.