I'm being "seduced" with an assignment: a device that does pass-through real-time processing of an up-to-4K 60 fps video stream, among other things. I would like to ask more experienced people for advice.
I'm researching how doable it is in the Artix / low-end Kintex zone, and whether I should try it or immediately engage the contingency bailout protocol due to avalanching costs and complexity.
I had some experience with Altera devices around 9 years ago, so I don't consider myself a total noob; I just didn't follow the FPGA world at all for the last 7+ years, and I'm not really familiar with current hardware capabilities; I've started to catch up recently. I also did some offline CV work.
I have already found out that 4K 60 fps is... non-trivial. So I should probably stick to 1080p 60 fps full RGB at most and enjoy implementing the system with actual margins, as well as relearning the workflow.
I consider two boards:
- Nexys Video (top Artix 7)
- Genesys 2 (some "-2" Kintex 7)
It seems both boards can handle 1080p60 over "HDMI". I still prefer the Genesys 2. Worth it?
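To sanity-check my own claim, here is the back-of-envelope arithmetic for why 1080p60 RGB should be comfortable (using the nominal CEA-861 1080p60 timing; a sketch, not a design spec):

```python
# CEA-861 1080p60 timing: active 1920x1080 plus blanking -> 2200x1125 total.
H_TOTAL, V_TOTAL, FPS = 2200, 1125, 60

pixel_clock_hz = H_TOTAL * V_TOTAL * FPS          # 148_500_000 -> 148.5 MHz
# DVI/HDMI TMDS: 8b/10b, so each of the 3 data lanes carries 10 bits per
# pixel clock -> 1.485 Gbps per lane.
tmds_per_lane_gbps = pixel_clock_hz * 10 / 1e9

print(pixel_clock_hz / 1e6)     # 148.5
print(tmds_per_lane_gbps)       # 1.485
```

1.485 Gbps per lane is what the OSERDES-based DVI demos on these boards already run, which is why 1080p60 works without touching the GTs at all.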
Some details on the project: the processing involves a bunch of different serial and parallel pixel-by-pixel algorithms, convolutions, compositing, and a lot of other DSP strewn all over. Probably some frame buffering would be needed too for certain bursty stuff. But I really want to avoid buffering the entire stream in external memory; I expect it to be a chokepoint, am I right? No stream audio processing, no external frame syncing.
I'm developing the DSP and demos with OpenCV on offline video, keeping it as FPGA-friendly as possible. The thing would be fed by a graphics card and/or pricey cameras (yes, two sinks, one source).
Now I have some general FPGA/video questions:
1. How should I handle syncing multiple parallel pipelines? Example:
I split the rx video stream into a Short Pipeline (several stages) block and a Long Pipeline (many stages) block. I need to sync the pipelines' outputs to blend them pixel-by-pixel. And in the end, I need to sync this blended stream with the untouched audio stream and pump them both into the video tx. Should I pad the shorter stream with FIFO structures (either in fabric or in external memory)? But how can I deduce the padding FIFOs' depth, as I can't tell from the HDL how long the Long Pipeline is (and I'm not going to count flip-flops in the output netlist)?
[Late thought] That is not a good example, but it is on point at least for the audio stream... ultimately I could just move the parallel paths and compositing into a serial path, leaving very few syncing points (audio)... and the latency is not super relevant... Hmmm.
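My current working idea for the depth question, in case it helps someone correct me: measure each path's latency in simulation (e.g. inject a tagged pixel and count cycles until it emerges from each pipeline) instead of trying to read it off the HDL, then size the FIFO from the difference. A minimal sketch, with made-up names and a guessed slack value:

```python
# Hypothetical sizing of the realignment FIFO between the Short and Long
# pipelines. Latencies are in clock cycles, obtained from simulation
# (count cycles between a marker pixel entering and leaving each path).

def padding_fifo_depth(long_latency, short_latency, slack=8):
    """Cycles the short path must buffer so both outputs align pixel-for-
    pixel, plus some slack for handshake/backpressure jitter (the slack
    value here is a guess -- tune it for your design)."""
    if long_latency < short_latency:
        raise ValueError("swap the arguments: pad the shorter path")
    return (long_latency - short_latency) + slack

# e.g. Long Pipeline measured at 412 cycles, Short Pipeline at 37:
print(padding_fifo_depth(412, 37))   # 383
```

With both pipelines free-running at the same clock and no backpressure, the delta alone would suffice; the slack only covers Valid/Ready stalls.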
2. Are these AXI-Stream cores relevant? I mean, are they good/helpful/usable/worthy (I have such an impression)? Or are they something like the STM32 HAL: marketing-imposed buggy bloatware that busts timing and resources?
3. How should I handle the processing clock vs. the AXI-Stream pixel clock? Should I cross domains over a FIFO or something? Or just sample the stream with the processing clock and play with the Valid signal? I see that I can't really pump the processing clock much higher than the pixel clock.
4. How much does Xilinx charge for the HDMI/DP core licenses? Is it a perpetual license in the ~$1k range? Or is it another of the usual "suck up $10k annually or gtfo"?
5. Is HDCP encryption a thing? I mean, can the venerable Sony camera or graphics card refuse to cooperate upon discovering how much I despise the HDMI overlords? No, I am not going to sink Netflix or similar crap.
Bonus: Vivado is really slow at loading different viewports and other things... is that a feature? Quite a shame for C++ software.
I would still like to ask around in advance about 4K viability.
1. External hardware:
Both boards have their HDMI ports hooked to regular I/O pins and can do only DVI-D, thus I cannot even spawn the Xilinx HDMI rx/tx core. So it seems I'd have to cobble together another board with HDMI rx/tx interface ICs and attach it through the FMC connector to the actual GTs, am I right?
Another annoyance: there is a tiny footnote in the Xilinx HDMI rx/tx IP UG: the HDMI 2.0 IP cannot be spawned on Artix and "-1" Kintex devices. Hence the Genesys 2 seems to be the winner here. But how trappy can it turn out? Like, can these IPs all by themselves eat the majority of timing/placement margins, even on a "-2" Kintex?
The Nexys has just one DisplayPort, which is kinda useless, but at least both boards seem to have DP lanes connected to GTs. Though I would prefer leaning towards HDMI. Ideally I would like to have both HDMI and DP sinks/sources.
2. Can the Kintex handle the 4K 60 fps max-RGB stream anyway, assuming correctly handled HDMI 2.0 or DP 1.2 (~13-15 Gbps)? Like, how can I tell what pixel clock frequency the HDMI/DP IP stream can output as a function of the negotiated data format? Or gauge such capabilities in any other way, other than, well, building the target design?
[Late thought] I wonder if the source always streams the audio, or whether it depends on EDID queries... [inquired] It seems I actually can tell it I don't support audio. Wondering if that actually thins the downlink.
Btw, can the Genesys 2 board handle 4K60 at any quality on its DPs? Couldn't find confirmation.
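For reference, here is my own worked arithmetic on why 4K60 full RGB sits right at the HDMI 2.0 ceiling (nominal CTA-861 4K60 timing; treat it as a sanity sketch, someone please correct me if the assumptions are off):

```python
# CTA-861 4K60 timing: active 3840x2160 plus blanking -> 4400x2250 total.
H_TOTAL, V_TOTAL, FPS = 4400, 2250, 60

pixel_clock_hz = H_TOTAL * V_TOTAL * FPS            # 594_000_000 -> 594 MHz
# TMDS 8b/10b, 3 data lanes -> 5.94 Gbps per lane, 17.82 Gbps aggregate.
tmds_total_gbps = pixel_clock_hz * 10 * 3 / 1e9

print(pixel_clock_hz / 1e6)    # 594.0
print(tmds_total_gbps)         # 17.82 (HDMI 2.0 tops out at 18 Gbps)
```

The 5.94 Gbps per-lane rate is at least within what a "-2" Kintex-7 GTX can serialize, which I suspect is exactly why the footnote rules out Artix and "-1" parts; whether the surrounding logic then closes timing is the part I can't tell without building it.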
3. Is USB 3.2 Gen-2-whatever a viable solution in place of HDMI/DP here? I'm not sure how the camera outputs video to a PC over USB. Does it tunnel the video, mangled with a proprietary codec, over a proprietary USB class into a proprietary driver? Or does it simply use the USB physical lines, like DP can? Yes, I'm trying to shoo away the USB ideas with "buy yourself the adapters, leave me alone".
If you don't mind, I may come back with more queries... sorry if I ask about something plainly obvious; I'm already drowning in this quite vast lake of knowledge.