Author Topic: Example of using Xilinx memory interface?  (Read 3843 times)


Online SiliconWizard (Topic starter)

  • Super Contributor
  • ***
  • Posts: 14464
  • Country: fr
Example of using Xilinx memory interface?
« on: January 23, 2019, 12:55:36 am »
Does anyone have a working example of using the Xilinx memory interface (MIG) to access DDR RAM, preferably in VHDL?

To be more precise, this would be on a Spartan 6 with DDR3 RAM.
I have read documentation on how to generate the MIG core (no problem with that), but using it is a bit more cryptic to me so far. I understood that you have to send read or write commands to "channels" that are buffered in FIFOs, and there is an example design that is generated when you generate the MIG core, but the example seems very crude and not that well written either.

So if anyone has used the "memory interface" and used it efficiently (meaning close to the max. possible throughput), that would help. I'd also need to be able to write and read to the same RAM chip from two different clock domains, so I'd like to make sure the Xilinx MIG allows it.

Thanks for any pointers!
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2732
  • Country: ca
Re: Example of using Xilinx memory interface?
« Reply #1 on: January 23, 2019, 04:34:32 am »
MIG generates an example design together with the actual core that you will use in your design. The example includes a traffic generator to verify functionality - at least it does so in Vivado (folder example_design inside the generated core's folder). For 7 series there is an excellent document, UG586, which explains in great detail how the core works and how to interface with it. I have only worked with MIG through the AXI4 interface, so I can't say whether the native interface is much different. The aforementioned example design can be synthesized and downloaded to the FPGA to verify the core in hardware.
« Last Edit: January 23, 2019, 04:36:12 am by asmi »
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Example of using Xilinx memory interface?
« Reply #2 on: January 23, 2019, 06:05:17 am »
http://hamsterworks.co.nz/mediawiki/index.php/MCB_Frame_Buffer

Performance depends on how you treat it. Big transactions vs small ones.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Online SiliconWizard (Topic starter)

  • Super Contributor
  • ***
  • Posts: 14464
  • Country: fr
Re: Example of using Xilinx memory interface?
« Reply #3 on: January 23, 2019, 04:52:09 pm »
Quote from: asmi
MIG generates an example design together with the actual core that you will use in your design, which includes a traffic generator to verify functionality - at least it does so in Vivado (folder example_design inside the generated core's folder).

ISE does too, as I mentioned, but the quality of the example is not that great IMO. Maybe they have improved things with Vivado.

Quote from: asmi
For 7 series there is an excellent document, UG586, which explains in great detail how the core works and how to interface with it.

At first sight, it looks like this document is better than the equivalent for the 6 series. I'll have a look.
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3146
  • Country: ca
Re: Example of using Xilinx memory interface?
« Reply #4 on: January 23, 2019, 06:48:26 pm »
Quote from: asmi
For 7 series there is an excellent document, UG586, which explains in great detail how the core works and how to interface with it.

Quote from: SiliconWizard
At first sight, it looks like this document is better than the equivalent for the 6 series. I'll have a look.

7 series chips do not have dedicated memory controllers. Instead, the memory controller is built from generic blocks. MIG does use PHASER modules which are there only for DDR2/3 (and are not documented), but everything else is generic. In contrast, Spartan 6 has dedicated MCB controllers.
 

Online SiliconWizard (Topic starter)

  • Super Contributor
  • ***
  • Posts: 14464
  • Country: fr
Re: Example of using Xilinx memory interface?
« Reply #5 on: January 23, 2019, 07:07:29 pm »
Thanks for pointing this out. I haven't looked too much into the 7 series yet. That would explain significant differences in the documentation.

I don't know if the lack of MCB controllers in the 7 series is good news or not...
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2732
  • Country: ca
Re: Example of using Xilinx memory interface?
« Reply #6 on: January 23, 2019, 11:05:04 pm »
Quote from: SiliconWizard
I don't know if the lack of MCB controllers in the 7 series is good news or not...
Yes it is, as softcore provides a lot of flexibility. You get to choose which pins to use, can opt to save IO pins by hardwiring some DDR3 pins (like CS). This allows some quite interesting layouts like a single-bank 256Mx16 (512 Mbytes total) DDR3 interface, there are also almost endless possibilities for pin-swapping to make layout easier regardless of how you placed component(s).
Besides, 7 series devices cost the same as or less than equivalent S6 ones, the fabric and dedicated HW blocks are much faster, and the IDE is much more advanced and includes more for free. Even the softcore MIG on a 7 series device is actually faster than the S6 hardware MCB (you can implement an 800 MT/s DDR3 interface even in the slowest speed grade Artix-7, while S6 can only do that at the highest speed grade). Basically, short of using up old stocks of S6 devices, I see no reason to ever choose them over 7 series.

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2732
  • Country: ca
Re: Example of using Xilinx memory interface?
« Reply #7 on: January 23, 2019, 11:16:51 pm »
Quote from: SiliconWizard
ISE does too, as I mentioned, but the quality of the example is not that great IMO. Maybe they have improved things with Vivado.
It doesn't matter a whole lot, as you only need to figure out how to send commands to it and receive data. All dynamic memories heavily favor long bursts over short random access, so plan your memory transactions accordingly. You can utilize FIFO's programmable ALMOST_FULL/ALMOST_EMPTY flags which will guarantee that you have at least enough data/space for data for entire burst.
Since pretty much all IPs in Vivado use different variants of AXI4 bus, I took time to figure out how it works and now can easily interface with just about any IP using this protocol. And it's very simple too yet quite efficient!
I've just skimmed through UG388, and it seems to provide some guidance on how to use the MCB, but there is also UG416, which covers the AXI4 user interface. Since I know AXI4, that's what I would've chosen.
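
To illustrate the ALMOST_FULL/ALMOST_EMPTY idea above, here is a toy Python model (not MIG or MCB code). The burst length, the FIFO depth, the function names and the exact comparison used for each flag are all assumptions made for illustration; real FIFO primitives define their threshold semantics slightly differently, so check the relevant FIFO documentation before copying the numbers.

```python
# Toy model of burst gating with programmable FIFO flags.
# All constants and names are hypothetical, chosen only for illustration.

BURST_WORDS = 64   # assumed burst length, in user-port words
FIFO_DEPTH = 512   # assumed FIFO depth, in words


def almost_empty(fill: int, threshold: int = BURST_WORDS) -> bool:
    """Flag is set while the FIFO holds fewer words than one full burst."""
    return fill < threshold


def almost_full(fill: int, depth: int = FIFO_DEPTH,
                threshold: int = BURST_WORDS) -> bool:
    """Flag is set while the FIFO has less free space than one full burst."""
    return depth - fill < threshold


def can_write_burst(write_fifo_fill: int) -> bool:
    # User logic -> write FIFO -> memory: need a whole burst of data queued.
    return not almost_empty(write_fifo_fill)


def can_read_burst(read_fifo_fill: int) -> bool:
    # Memory -> read FIFO -> user logic: need room for a whole burst.
    return not almost_full(read_fifo_fill)
```

With the thresholds set to one burst, a write command is only issued once a whole burst of data is waiting, and a read command only once the read FIFO can absorb a whole burst, so neither burst stalls halfway.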

Online SiliconWizard (Topic starter)

  • Super Contributor
  • ***
  • Posts: 14464
  • Country: fr
Re: Example of using Xilinx memory interface?
« Reply #8 on: January 24, 2019, 12:29:25 am »
Thanks. With hamster's code and the added information you both gave, I've re-read the Spartan 6 mem. controller doc. and it's now a lot clearer to me.

I will definitely consider the 7 series. I have stuck to the 6 series so far for three main reasons: the TQFP packages, which are easy to hand solder, a couple of projects using Spartan 6's that are still maintained, and lastly I have quite a few Spartan 6 dev. boards lying around, some with DDR3 RAM.

I know the Spartan 7 is a bit cheaper, but there are still relatively few dev. boards for it compared to the impressive number of Spartan 6 boards still on the market for much lower prices.
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2732
  • Country: ca
Re: Example of using Xilinx memory interface?
« Reply #9 on: January 24, 2019, 01:10:29 am »
Quote from: SiliconWizard
I will definitely consider the 7 series. I have stuck to the 6 series so far for three main reasons: the TQFP packages, which are easy to hand solder, a couple of projects using Spartan 6's that are still maintained, and lastly I have quite a few Spartan 6 dev. boards lying around, some with DDR3 RAM.

I know the Spartan 7 is a bit cheaper, but there are still relatively few dev. boards for it compared to the impressive number of Spartan 6 boards still on the market for much lower prices.
S6s in TQFP packages don't have MCBs, if my memory serves me. As for S7 devboards - it's understandable since this is a very new family, which only recently went into full production. But you can always build your own board, which is what I do. S7 devices are available in the FBGA196 package, which is very easy to work with as it was designed specifically to be fully broken out on cheap 4 layer boards. Since services like JLCPCB now offer very cheap 4 layer boards ($29 for 10 boards 10x10 cm), you can build just about any board you want. I personally prefer working with 6 layer boards as it makes layout much easier when working with larger packages, but I'm fairly confident that an x16 DDR3 interface can be routed on just two signal layers, as that's what I did several times on my boards (though those boards were 6 layer, I only used two signal layers for the actual memory routing). Since there are a lot of possible valid pin combinations, I tend to route DDR3 "backwards" - route the actual traces first, then back-annotate pin assignments based on what I was able to route, and verify the pin assignments with MIG's pin validation.
« Last Edit: January 24, 2019, 01:20:49 am by asmi »
 

Online SiliconWizard (Topic starter)

  • Super Contributor
  • ***
  • Posts: 14464
  • Country: fr
Re: Example of using Xilinx memory interface?
« Reply #10 on: January 27, 2019, 05:40:51 pm »
I'll definitely consider doing that for future developments.

Anyway, I managed to get something working on a Spartan-6 dev board with LPDDR RAM (512Mb, 16-bit). I wrote a test in which I first write a 4MBytes block (sequential) at the maximum rate I can (using the max 64 burst length and 32-bit port width), then read back the same block also as fast as possible and checking each read word. And then back to block write, etc. Works fine. There is no concurrent read or write during each phase.
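
For reference, the size of that test in bursts can be worked out with a few lines of Python. This is only arithmetic on the figures above, not the actual test code; it assumes "4MBytes" means 4 MiB, and the function and key names are made up for the example.

```python
# Arithmetic sketch: how many user-port words and bursts one test block needs.
# Assumes 4 MBytes == 4 * 1024 * 1024 bytes; names are hypothetical.

BLOCK_BYTES = 4 * 1024 * 1024   # block size used in the test
PORT_WIDTH_BITS = 32            # user port width
BURST_WORDS = 64                # maximum burst length, in port words


def burst_plan(block_bytes: int, port_width_bits: int, burst_words: int) -> dict:
    bytes_per_word = port_width_bits // 8
    bytes_per_burst = bytes_per_word * burst_words
    if block_bytes % bytes_per_burst:
        raise ValueError("block is not a whole number of bursts")
    return {
        "words": block_bytes // bytes_per_word,
        "bytes_per_burst": bytes_per_burst,
        "bursts": block_bytes // bytes_per_burst,
    }


plan = burst_plan(BLOCK_BYTES, PORT_WIDTH_BITS, BURST_WORDS)
print(plan)
```

So each write or read phase is 16384 back-to-back bursts of 256 bytes.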

The LPDDR chip is rated for 166MHz max, I used 150MHz for this test. I measured the total write time and read time (toggling I/Os). I get approx. 80% of the max. bandwidth (which would be 600MBytes/s) for the block write, and 70% for the block read. Not bad, but I don't know if one can get better, and I don't know whether it's normal that reading is a bit slower than writing. Granted I'm already happy to get 480MBytes/s write and 430MBytes/s read at my first try. ;D
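
The 600MBytes/s figure and the percentages come from simple arithmetic, sketched here in Python. It assumes a 16-bit data bus transferring on both clock edges and takes MBytes as 10^6 bytes; the function names are made up for the example.

```python
# Peak DDR bandwidth and measured efficiency for the figures quoted above.
# Assumes a 16-bit bus, two transfers per clock, and 1 MByte = 10**6 bytes.

def peak_bandwidth_bytes_per_s(clock_hz: int, bus_width_bits: int) -> int:
    # Double data rate: one transfer on each clock edge.
    return clock_hz * 2 * bus_width_bits // 8


def efficiency(measured_bytes_per_s: float, peak_bytes_per_s: float) -> float:
    return measured_bytes_per_s / peak_bytes_per_s


peak = peak_bandwidth_bytes_per_s(150_000_000, 16)
write_eff = efficiency(480_000_000, peak)
read_eff = efficiency(430_000_000, peak)

print(f"peak  : {peak / 1e6:.0f} MBytes/s")
print(f"write : {write_eff:.1%} of peak")
print(f"read  : {read_eff:.1%} of peak")
```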

What's your experience with max. throughput using DDR? Are there more favorable settings, such as maybe limiting the burst length (as unintuitive as it may sound) or using a wider user port width (64 or 128-bit)? Does LPDDR have worse performance than, say, classic DDR2 at the same clock speed?

One last thought is that the slightly degraded performance could also come from the signal routing, as there is a calibration phase. The board is a Numato Saturn V3. The routing doesn't seem too bad, but it's a small board and it looks pretty crowded.
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2732
  • Country: ca
Re: Example of using Xilinx memory interface?
« Reply #11 on: January 28, 2019, 01:27:08 am »
I'm not sure about LPDDR, but for DDR3 memory, DQS calibration and write leveling are standard procedures at startup and so not something to worry about. As for signal integrity issues - they will show up as incorrect data, as the DDRx physical protocol has no error detection or correction. Since you've seen no such errors, you should be good. In general it's more of an issue with DDR3 memories as previous versions are just not fast enough, and even with DDR3 it's quite easy to counter with termination.
For bus width - check MIG docs to find out what the "native" port width is, as if you use any other width there will be some additional latency for converting. It shouldn't affect the bandwidth though.
If you don't mind posting the source code of the component that's actually talking to MIG, we might be able to see if there's anything wrong with it. Honestly, I've never bothered to measure the maximum memory bandwidth that's possible to utilize, as the DDR3L x16 400 MHz interface I typically use has far higher BW than I ever needed. But now I'm kinda curious and maybe I will create a test for that :D

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3146
  • Country: ca
Re: Example of using Xilinx memory interface?
« Reply #12 on: January 29, 2019, 09:45:01 pm »
Quote from: SiliconWizard
What's your experience with max. throughput using DDR? Are there more favorable settings, such as maybe limiting the burst length (as unintuitive as it may sound) or using a wider user port width (64 or 128-bit)? Does LPDDR have worse performance than, say, classic DDR2 at the same clock speed?

These interfaces are similar, so the principle is the same. In theory, you can get continuous read (or continuous write). Memory consists of banks, so while one bank is activated/deactivated the other one could be read/written to.  This creates continuity. The only exception is that you have to pause for refresh. In practice, it all depends on how the controller does it. Anyway, 70% or 80% is not the maximum.

You can check how the address is formed (if you can change that) - starting from LSB, there should be columns, then banks, then rows. Such arrangement should make the consecutive reads (or consecutive writes) possible, but there still will be refresh pauses.
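
A small Python sketch of that column/bank/row ordering follows. The field widths (10 column bits, 2 bank bits, 13 row bits) are an assumed geometry for a 512Mb x16 part, not taken from any datasheet quoted in this thread, and the function name is made up; it only shows how a linear word address splits up, not how the MCB is configured.

```python
# Address split with columns in the LSBs, then banks, then rows.
# Field widths are an assumed geometry, for illustration only.

COL_BITS, BANK_BITS, ROW_BITS = 10, 2, 13


def split_address(word_addr: int) -> tuple:
    """Return (row, bank, col) for a linear word address."""
    if not 0 <= word_addr < 1 << (COL_BITS + BANK_BITS + ROW_BITS):
        raise ValueError("address out of range")
    col = word_addr & ((1 << COL_BITS) - 1)
    bank = (word_addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = word_addr >> (COL_BITS + BANK_BITS)
    return row, bank, col


# Walk across the points where a sequential access changes bank or row.
for addr in (0, 1023, 1024, 2048, 3072, 4095, 4096):
    row, bank, col = split_address(addr)
    print(f"addr {addr:5d} -> row {row}, bank {bank}, col {col}")
```

With this ordering a sequential access fills one row in bank 0, then moves on to banks 1, 2 and 3 before it has to open a new row in bank 0, which gives the controller the chance to prepare the next bank while the current one is being accessed.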
 

