
OpenHBMC. Open-source AXI4-based HyperBus memory controller


VGN:
Hello everyone!

I'm developing a high-performance AXI4-based HyperBus memory controller for Xilinx 7-series FPGAs.
This memory controller is a small part of another project of mine: https://www.eevblog.com/forum/thermal-imaging/openirv-isc0901b0-(autoliv-nv3-flir-e4568)-based-opensource-thermal-camera/
Anyway, I thought it was probably worth publishing as a separate project.

This IP-core is packaged for Vivado 2020.1 for easy block design integration, though you can also use the raw HDL files.

This is the first release. I haven't tested it thoroughly yet, but it successfully passes a continuous memory test at ~162MHz, i.e. 325MB/s, on a custom devboard with a Spartan-7 and a single HyperRAM. I'm going to push for 200MHz (400MB/s) as soon as I get a board with a HyperRAM capable of 200MHz; right now the memory I have is limited to 166MHz. Soon I'm going to test long burst transfers on hardware; a DMA is needed for this, as MicroBlaze cannot initiate long burst transfers.

Resource utilization: 565 LUT, 678 FF, 1.5 BRAM (1.5 x RAMB36E1 = 3 x RAMB18E1)

Feel free to ask questions, criticize the design, report bugs, and donate if you like this IP-core)

Link to repo: https://github.com/OVGN/OpenHBMC

KE5FX:
Good stuff.   :-+ I found it surprisingly hard to get my own HyperRAM state machine working.  The data sheets are pretty awful and the interface itself is more complicated than it has any right to be.  (What's up with all the fixed-versus-variable latency cruft, for instance... why can't I just poll RWDS to find out when the device is ready to read or write...?)

asmi:

--- Quote from: KE5FX on September 23, 2020, 06:50:30 am ---Good stuff.   :-+ I found it surprisingly hard to get my own HyperRAM state machine working.  The data sheets are pretty awful and the interface itself is more complicated than it has any right to be.  (What's up with all the fixed-versus-variable latency cruft, for instance... why can't I just poll RWDS to find out when the device is ready to read or write...?)

--- End quote ---
I found it very easy and the documentation very good and helpful. I had no problems implementing the interface. As for latency, by default the memory starts up in fixed double-latency mode, so you don't have to worry about it at all. This is also the only possible mode for the dual-die 128Mbit chips. The option to turn on variable latency is there in case you want to trade lower latency for increased controller complexity.
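Roughly, the decision boils down to something like this (a made-up sketch with placeholder names, not my actual code): in the default fixed mode you always insert the doubled initial latency, while with variable latency enabled you sample RWDS during the Command-Address phase and double only when the device asks for it.

--- Code: ---
// Hypothetical sketch of the initial-latency decision (placeholder names).
// Fixed mode (power-up default): always 2x latency. Variable mode: RWDS
// sampled during the CA phase tells you whether 2x latency is needed.
module hbus_latency_sketch #(
    parameter LATENCY_CLOCKS = 6          // initial latency from CR0, in clocks
) (
    input  wire       clk,
    input  wire       ca_phase_done,      // pulses when the CA phase completes
    input  wire       rwds_at_ca,         // RWDS level captured during the CA phase
    input  wire       fixed_latency,      // 1 = fixed 2x latency (default)
    output reg  [4:0] latency_cnt = 5'd0,
    output wire       latency_done
);
    assign latency_done = (latency_cnt == 0);

    always @(posedge clk) begin
        if (ca_phase_done)
            latency_cnt <= (fixed_latency || rwds_at_ca) ? 2*LATENCY_CLOCKS
                                                         :   LATENCY_CLOCKS;
        else if (latency_cnt != 0)
            latency_cnt <= latency_cnt - 1'b1;
    end
endmodule
--- End code ---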
Fixed latency is also great when you want to scale your memory interface horizontally by using multiple chips in parallel lockstep, effectively getting a multiple of the bandwidth (2x for 2 chips, 4x for 4, etc.). This is necessary, for example, when you want to use these chips as a frame buffer: 720p@60Hz requires 1280*720*4*60 ≈ 211 MB/s of bandwidth, and you will need at least double that (so that you can read and write at the same time). In my design I used a pair of chips in parallel for 720p; for 1080p@60Hz (~475 MB/s) you would want 3 or 4 chips.
That said, I gotta say I prefer using LPDDR1 modules for smaller designs when using MIG is not an option due to its rather high resource requirements. These chips are available in a variety of capacities from 128Mbit to 2Gbit with a x16 or x32 data bus, can go up to 200 MHz DDR at CL3, and their protocol is quite simple to implement (though not as simple as HyperBus).

VGN:

--- Quote from: KE5FX on September 23, 2020, 06:50:30 am ---Good stuff.   :-+ I found it surprisingly hard to get my own HyperRAM state machine working.  The data sheets are pretty awful and the interface itself is more complicated than it has any right to be.  (What's up with all the fixed-versus-variable latency cruft, for instance... why can't I just poll RWDS to find out when the device is ready to read or write...?)

--- End quote ---
Thanks! Yes, the interface is a bit complicated, but not too much for me. Ha-ha, polling RWDS to find out when the device is ready is next-generation DRAM; we'll have to wait another 10 years for that))


--- Quote from: asmi on September 23, 2020, 02:09:41 pm ---I had no problems implementing the interface.

--- End quote ---
I'm very interested: how did you solve the problem of transferring data from the RWDS clock domain to the internal FPGA clock domain?

asmi:

--- Quote from: VGN on September 23, 2020, 04:10:34 pm ---I'm very interested: how did you solve the problem of transferring data from the RWDS clock domain to the internal FPGA clock domain?

--- End quote ---
I used ISERDES in memory mode and fed the rwds clock to the CLK/CLKB pins of the SERDES, while feeding free-running clocks from the MMCM to the OCLK/OCLKB/CLKDIV pins as described in UG471. I also considered using IN_FIFO to help close timing (it can act as a 1:2 deserializer as well as a regular FIFO, which is how it's used in DDR2/3 controllers), but for a -2 device and 166 MHz this proved unnecessary. With this arrangement (no IN_FIFO) I was able to close timing at 200 MHz DDR for an LPDDR1 x16 controller on a -2 device by carefully choosing which IO pins to use for the dq and dqs signals, but I suspect that for a -1 device those FIFOs might be necessary. So I think it shouldn't be very hard to get the new 200 MHz HB2 chips to work. Also note that my design used two 64Mbit chips in parallel, so I had 2 parallel HB buses and thus 2 separate dqs clocks.
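Roughly, the per-DQ-pin capture looks something like this (a simplified sketch with placeholder signal names, not my exact code):

--- Code: ---
// Hypothetical sketch of the ISERDESE2 arrangement from UG471 "MEMORY" mode:
// the RWDS strobe drives CLK/CLKB for capture, while free-running MMCM clocks
// on OCLK/OCLKB/CLKDIV move the bits into the internal clock domain.
module hbus_dq_capture_sketch (
    input  wire       dq_delayed,     // one DQ pin, after IDELAYE2
    input  wire       rwds_bufio,     // RWDS strobe through a BUFIO
    input  wire       clk_oclk,       // free-running MMCM clock (full rate)
    input  wire       clk_div,        // the same clock divided by 2
    input  wire       rst,
    output wire [3:0] dq_parallel     // 4 captured bits per CLKDIV cycle
);
    ISERDESE2 #(
        .INTERFACE_TYPE ("MEMORY"),   // strobe capture + OCLK domain transfer
        .DATA_RATE      ("DDR"),
        .DATA_WIDTH     (4),
        .SERDES_MODE    ("MASTER"),
        .IOBDELAY       ("IFD"),      // take data from the DDLY (IDELAY) path
        .NUM_CE         (1)
    ) iserdes_dq (
        .D            (1'b0),
        .DDLY         (dq_delayed),
        .CLK          ( rwds_bufio),
        .CLKB         (~rwds_bufio),  // local inversion of the strobe
        .OCLK         ( clk_oclk),
        .OCLKB        (~clk_oclk),
        .CLKDIV       (clk_div),
        .CLKDIVP      (1'b0),
        .CE1          (1'b1),
        .CE2          (1'b1),
        .RST          (rst),
        .BITSLIP      (1'b0),
        .Q1           (dq_parallel[0]),
        .Q2           (dq_parallel[1]),
        .Q3           (dq_parallel[2]),
        .Q4           (dq_parallel[3]),
        .SHIFTIN1     (1'b0),
        .SHIFTIN2     (1'b0),
        .DYNCLKDIVSEL (1'b0),
        .DYNCLKSEL    (1'b0),
        .OFB          (1'b0)
    );
endmodule
--- End code ---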
I love how you went straight to the crux of the problem; it shows that you invested quite a bit of time getting this to work! Commendable :-+ :clap:
