Electronics > FPGA

Video pass-through Xilinx based device, general thoughts/queries


I'm being "seduced" with an assignment: a device which does pass-through real-time processing of up to a 4K 60fps video stream, among other things. I would like to ask more experienced people for advice.
And I'm researching how doable it is in the Artix/low-Kintex zone, and whether I should try it or incontinently engage the contingency bailout protocol due to avalanching costs and complexity.  :scared:

I had some experience with Altera devices sth like 9 years ago, so I do not consider myself a total noob; I just didn't follow the FPGA world at all for 7+ years, and I'm not really familiar with current hardware capabilities; started to catch up recently. I also did some offline CV work.

I have already found out that 4K60 is... non-trivial. So I prolly should stick to max 1080p60 full RGB or sth and enjoy implementing the system with actual margins, as well as relearning the workflow.
I consider two boards:
- Nexys Video (top Artix 7)
- Genesys 2 (some "-2" Kintex 7)
Seems both boards can handle 1080p60 over "HDMI". I still prefer the Genesys 2. Worth it?

Some details on the project: the processing involves a bunch of different series and parallel pixel-by-pixel algorithms, convolutions, compositing, and a lot of other DSP spewed all over. Probably some frame buffering would be needed too for certain bursty stuff. But I really want to avoid buffering the entire stream in external memory; I expect it to be a choke point, am I right? No stream audio processing, no external frame syncing.
I'm developing the DSP and demos with OpenCV, on offline video, keeping in mind that it should be as FPGA-friendly as possible. The thing would be fed by a graphics card and/or pricey cameras (yes, two sinks, one source).

Now I have some general FPGA/video questions:
1. How should I handle syncing multiple parallel pipelines? Example:
    I split the rx video stream into a Short Pipeline (several stages) block and a Long Pipeline (many stages) block. I need to sync the pipelines' outputs to blend them pixel-by-pixel. And in the end, I need to sync this blended stream with the untouched audio stream and pump them both into the video tx. Should I pad the shorter stream with FIFO structures (either in fabric or in external memory)? But how can I deduce the padding FIFOs' depth, as I can't tell from the HDL how long the Long Pipeline is (nor would I want to count flip-flops by hand)?
    [Late thought] That is not a good example, but it's on point at least for the audio stream... ultimately I could just fold the parallel paths and compositing into a serial path, leaving very few syncing points (audio)... and the latency is not super relevant... Hmmm.
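To make the FIFO-padding idea from question 1 concrete, here's a toy Python model of what I mean (the latencies and the "processing" are made-up placeholders, nothing FPGA-specific): the short path gets padded by exactly the latency difference, so both outputs line up for the blend.

```python
from collections import deque

def blend_streams(pixels, long_latency=12, short_latency=3):
    """Toy model: one input stream split into a long and a short pipeline;
    the short path is padded with a FIFO so both outputs align for blending.
    Latencies are made-up numbers standing in for real pipeline depths."""
    pad = long_latency - short_latency          # FIFO depth = latency difference
    long_pipe  = deque([None] * long_latency)   # None = pipeline not yet primed
    short_pipe = deque([None] * short_latency)
    pad_fifo   = deque([None] * pad)
    blended = []
    for px in pixels:
        long_out = long_pipe.popleft();  long_pipe.append(px * 2)    # "long" processing
        short_out = short_pipe.popleft(); short_pipe.append(px + 1)  # "short" processing
        padded = pad_fifo.popleft();     pad_fifo.append(short_out)  # delay-matching FIFO
        if long_out is not None and padded is not None:
            blended.append((long_out + padded) // 2)  # pixel-by-pixel blend
    return blended
```

Both paths see pixel n emerge on the same cycle, so the blend always combines matching pixels; the same trick with a deeper FIFO (BRAM or external memory) would cover whole-line or whole-frame offsets. On real hardware I guess I'd measure the long path's latency once in simulation rather than count flip-flops.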

2. Are these AXI-Stream cores relevant? I mean, are they good/helpful/usable/worthy (I have such an impression)? Or is it sth like STM32 HAL: marketing-imposed buggy bloatware busting timings and resources?

3. How should I handle the processing clock vs the AXI-Stream pixel clock? Should I clock-cross them through a FIFO or sth? Or just sample the stream with the processing clock and play with the Valid signal? I see that I can't really pump the processing clock much higher than the pixel clock.
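What I imagine for the "just play with the Valid signal" variant, as a toy Python model (hypothetical 3-stage pipeline, made-up processing): every register gets valid as a clock-enable, so blanking gaps simply stall the pipeline instead of corrupting it.

```python
def gated_pipeline(samples, depth=3):
    """Enable-gated pipeline model: registers shift only when 'valid' is high,
    so gaps (valid=0, e.g. blanking) pass through without corrupting data.
    'samples' is a list of (valid, pixel) pairs, like beats on an AXI-Stream."""
    regs = [None] * depth                    # pipeline registers, oldest last
    out = []
    for valid, px in samples:
        if valid:                            # clock-enable: stall on invalid beats
            out_px = regs[-1]
            regs = [px + 1] + regs[:-1]      # stage 0 does the (toy) processing
            if out_px is not None:           # drop the priming Nones
                out.append(out_px)
    return out
```

This only works if the processing clock is the pixel clock (or synchronous to it); crossing to a genuinely different clock domain would be where the FIFO comes in.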

4. How much does Xilinx charge for the HDMI/DP core licenses? Is it permanent, in the $1k range? Or is it another usual "suck $10k up annually or gtfo"?

5. Is HDCP encryption a thing? I mean can the venerable Sony camera or graphic card refuse to cooperate upon discovery of how much I despise the HDMI overlords? No, I am not going to sink Netflix or similar crap.

Bonus: Vivado is really slow at loading different viewports and other things... is that a feature? Quite a shame for C++ software.

I would still like to ask around in advance about 4K viability.

1. External hardware:
Both boards have their HDMI ports hooked to regular I/Os and can do only DVI-D, thus I cannot even spawn the Xilinx HDMI rx/tx core. So it seems I'd have to cobble up another board with HDMI tx/rx interface ICs and slap it through the FMC connector onto the actual GTs, am I right?

Another annoyance: there is a tiny footnote in the Xilinx HDMI rx/tx IP UG: the HDMI 2.0 IP cannot be spawned on Artix and Kintex "-1" devices. Hence the Genesys 2 seems to be the winner here. But how trappy can it turn out? Like, can these IPs just by themselves eat the majority of timing/placement margins, even on a "-2" Kintex?

The Nexys has just 1 DisplayPort, kinda useless, but at least both boards seem to have DP lanes connected to GTs. Tho I would prefer leaning towards HDMI. Ideally I would like to have both HDMI and DP sinks/sources.

2. Can the Kintex handle the 4K60 max-RGB stream anyway, assuming correctly handled HDMI 2.0 or DP 1.2 (~13-15 Gbps)? Like, how can I tell what pixel clock frequency the HDMI/DP IP stream can output as a function of the negotiated data format? Or gauge such capabilities in any other way, except of, well, building the target design?
[Late thought] I wonder if the source always streams the audio, or it depends on EDID queries... [inquired] So it seems I actually can tell it I don't support audio. Wondering if that actually thins the downlink.
Btw can the Genesys 2 board handle any quality of 4K60 on its DPs? Couldn't find confirmation.
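For my own sanity, here's the raw link arithmetic I've been doing (plain Python, standard CTA-861 total timings; the coding-overhead figures are my understanding of the specs, please correct me):

```python
def video_rate(h_total, v_total, fps, bpp=24):
    """Pixel clock (MHz) and payload bit rate (Gb/s) for a given total timing."""
    pclk_mhz = h_total * v_total * fps / 1e6
    return pclk_mhz, pclk_mhz * bpp / 1e3

# 4K60 RGB: 4400 x 2250 total @ 60 Hz (CTA-861) -> 594 MHz, ~14.26 Gb/s
pclk_4k, bw_4k = video_rate(4400, 2250, 60)
# 1080p60 RGB: 2200 x 1125 total @ 60 Hz        -> 148.5 MHz, ~3.56 Gb/s
pclk_fhd, bw_fhd = video_rate(2200, 1125, 60)

# Link payload capacities after line coding:
hdmi20_payload = 3 * 6.0 * 8 / 10    # 3 TMDS lanes x 6 Gb/s, 10b chars -> 14.4 Gb/s
dp_hbr2_payload = 4 * 5.4 * 8 / 10   # 4 lanes x 5.4 Gb/s, 8b/10b      -> 17.28 Gb/s
```

So 4K60 RGB squeaks just under HDMI 2.0's payload and fits comfortably within DP HBR2, if my overhead numbers are right.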

3. Is USB 3.2 Gen-2-whatever a viable solution in place of HDMI/DP here? Not sure how the camera outputs video to a PC over USB. Does it tunnel the video, mangled with a proprietary codec, over a proprietary USB class into a proprietary driver? Or does it simply use the USB physical lines, like DP can? Yes, I'm trying to shoo away USB ideas with "buy yourself the adapters, leave me alone".

If you don't mind, I'll maybe come back with more queries... sorry if I ask for something plainly obvious; I'm already drowning in that quite vast lake of knowledge.

I would certainly look past the Nexys Video and at the Genesys 2. It's got fully-populated DP input and output (the Nexys only connects 2 DP lanes) connected to transceivers which can do 10 Gbps each, so you should be able to go up to DP HBR3 (8.1 Gbps per lane), which is more than enough for 4k@60. It's also got 1 GB of DDR3 running at up to 900 MHz with a 32-bit data bus, so 900 * 2 * 32 = 57.6 Gbps, which should be enough for double-buffering of a 4k@60 stream (writing and reading in parallel).
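Spelled out (theoretical peak; a real DDR3 controller will get maybe 70-80% of it, which still leaves headroom):

```python
# Genesys 2 DDR3: 900 MHz clock, double data rate, 32-bit bus -> peak throughput
ddr3_peak_gbps = 900e6 * 2 * 32 / 1e9     # 57.6 Gb/s theoretical

# 4K60 at 24 bpp, each frame written once and read once (double-buffering)
frame_bits = 3840 * 2160 * 24
stream_gbps = frame_bits * 60 / 1e9       # ~11.94 Gb/s one way
needed_gbps = stream_gbps * 2             # ~23.89 Gb/s for write + read
```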

I will tell you that I didn't try 4k@60 on that board (because I don't have 4k monitor over here, and none of my clients asked for it yet), but 1080p@60 is easy-peasy, even with complicated processing and composition.

There is one catch with this board - the FPGA on the board (K325T-2) is NOT included in the free WebPack license. So the board comes with a voucher for a license that is device-locked to the K325 (it covers any package). Thankfully, unlike Altera's licensing model, Xilinx licenses are version-limited as opposed to time-limited, meaning if you let your license lapse, you can keep using, forever, the last version of the IDE that was out at the time of expiry. Actually this board is the cheapest way to get your hands on a license for the K325.

Finally - all of those free AXI IPs are genuinely very useful, but you can probably write more efficient HDL yourself, because those IPs have to be usable in a multitude of scenarios, while your own HDL can be very task-specific. Unfortunately, the DisplayPort TX/RX IPs are not free, but you can still generate a trial license and use them for prototyping and/or hardware checkout, with the intention to either buy a license or design your own equivalent modules. It's not very complicated, but keep in mind that you can only get the DP 1.2 specification for free (so HBR2 is the limit - which is still enough for 4k@60); if you want anything beyond that, you will have to part with some cash.

As for USB 3.2 - Xilinx MGTs don't support it officially, but I seem to recall that there is a third-party IP which does. I think using USB for video is not the best idea because it doesn't guarantee bandwidth, and in my experience the kind of bandwidth you can actually get heavily depends on many factors outside your control - like if user connects your board directly into PC USB port, or through hub, or cable extender. If anything, I would sooner consider using 10G Ethernet for that (Genesys 2 can support it via FMC addon board - either commercial, or designed by you).

And final two cents - first of all, you can take Vivado, request trial versions of the DP RX and TX IPs and try fitting them into a K160 device (with SG 2 and in FFV package - that's important!), which is available in the free WebPack license. This way you will see what kind of resources you will need, and in general estimate feasibility before you commit to spending $1k on a Genesys 2 board.

Another thing - if you intend to do some OpenCV processing, you might want to try Vitis HLS, because it supports some subset of OpenCV, so you can enjoy writing your code in C/C++ as opposed to HDL.

Do you have access to the HDMI standards? The later (4k capable) ones are very hard to find unless you pay $$. The higher speed formats are very different to the older 1.X protocol.

I've got a Nexys Video, and it can only do DisplayPort at 2.7 Gb/s per channel. Only two channels are used on the mini DisplayPort, so that's 540 MB/s tops, and RGB 4:4:4 1080p60 is 447 MB/s. The Genesys 2 is a better option for DP. The HDMI on the Nexys Video is connected to the standard SelectIO SERDES, so in theory limited to < 1.25 Gb/s, so that is a 125 MHz pixel clock if you use 4:4:4 RGB (or YCC).
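The arithmetic behind those numbers, for anyone checking (my figures, worked out in Python):

```python
# Nexys Video mini-DP: 2 lanes x 2.7 Gb/s, 8b/10b coding -> usable bytes/s
dp_payload_Bps = 2 * 2.7e9 * 8 / 10 / 8    # 540e6 B/s = 540 MB/s
# 1080p60 RGB 4:4:4 incl. blanking: 148.5 MHz pixel clock x 3 bytes/pixel
fhd_Bps = 148.5e6 * 3                      # 445.5 MB/s -> fits under 540 MB/s
# HDMI on SelectIO SERDES: ~1.25 Gb/s per TMDS lane, 10 bits per pixel per lane
max_pclk_hz = 1.25e9 / 10                  # 125 MHz pixel clock ceiling
```

(445.5 MB/s is slightly under the 447 I quoted from memory - either way it fits.)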

Also, note that the Gensys2 Vivado license will version-lock you to the version and host you install it on. You can upgrade Vivado for about a year until you are stranded on old releases node-locked to old hardware. I would only use it on Linux, where the license can be moved along with my PCI NIC card as the PC is replaced.


--- Quote from: hamster_nz on April 26, 2021, 04:26:55 am ---Also, note that the Gensys2 Vivado license will version-lock you to the version and host you install it on. You can upgrade Vivado for about a year until you are stranded on old releases node-locked to old hardware. I would only use it on Linux, where the license can be moved along with my PCI NIC card as the PC is replaced.

--- End quote ---
You can easily change MAC address in Windows. No need to move anything.

