Author Topic: FPGA Video SDRAM (Read 7013 times)

electosleepy · « **on:** January 07, 2016, 05:21:57 am »

How do you determine the optimal word length for an external memory interface? For example I have seen that 256Mb SDRAM is available in the following word lengths.

256( 32M x 8 ) -->> (read as 32 million words that are 8 bits in length)
256( 16M x 16 )
256( 8M x 32 )

Say that the RAM is being used to store a video frame from an HDMI input which can be 24, 30, 36 or 48 bits per pixel, resulting in 8, 10, 12 and 16 bits per RGB colour respectively. If a pixel contained 24 bits (8 bits for red, 8 bits for blue and 8 bits for green), would it be optimal to use the SDRAM with the 8 bit word length when compared to storing two 8 bit words in an SDRAM with a 16 bit word length? Is it equivalent?

If two 8 bit numbers are stored at one 16 bit memory address, and the first 8 bit pixel value is changed while the second remains unchanged, does the whole 16 bit number have to be rewritten, making it less efficient than changing one pixel at an 8 bit address? Simply put, can the individual bits within a RAM address be changed, or must the whole word be rewritten?

marshallh · « **Reply #1 on:** January 07, 2016, 05:46:06 am »

With video you always want as much bandwidth as you can get.

So x32 it is, or several x8/x16 parts ganged up.

You can use data masking to selectively write bytes.
Also, you won't get enough bandwidth with just SDRAM, and SDR parts are becoming impractical to spec into new designs. Aim for DDR1/2.

helius · « **Reply #2 on:** January 07, 2016, 06:07:39 am »

Quote from: electosleepy on January 07, 2016, 05:21:57 am

How do you determine the optimal word length for an external memory interface? For example I have seen that 256Mb SDRAM is available in the following word lengths.

256( 32M x 8 ) -->> (read as 32 million words that are 8 bits in length)
256( 16M x 16 )
256( 8M x 32 )

These are the number of data signals brought out to the bus, but the RAM is accessed internally in larger parcels called lines. The access time will be best when a line is opened, then you read and/or write words from that line, then close it and open another line. How many bits are in each word is a parameter for pin allocation and routing but doesn't affect the RAM's latency. RAMs are approximately square, so the line length is about the sqrt of the capacity, or 16k in your case.
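The square-root estimate above can be checked with a couple of lines. This is only a rough sketch of that rule of thumb: `approx_row_bits` is a made-up helper name, and a real part is split into banks, so the actual row size should be taken from the datasheet.

```python
import math

def approx_row_bits(capacity_bits: int) -> int:
    """Rule-of-thumb row (line) length for a square array: integer sqrt of capacity."""
    return math.isqrt(capacity_bits)

capacity = 256 * 1024 * 1024       # a 256 Mbit part
print(approx_row_bits(capacity))   # 16384, i.e. the "16k" quoted above
```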

Quote

Say that the RAM is being used to store a video frame from an HDMI input which can be 24, 30, 36 or 48 bits per pixel, resulting in 8, 10, 12 and 16 bits per RGB colour respectively. If a pixel contained 24 bits (8 bits for red, 8 bits for blue and 8 bits for green), would it be optimal to use the SDRAM with the 8 bit word length when compared to storing two 8 bit words in an SDRAM with a 16 bit word length? Is it equivalent?

It is not equivalent because 24 mod 8 = 0, but 24 mod 16 = 8. Whether that makes any difference depends on how you design the logic; it is not an insurmountable problem. You may need padding at the end of a pixel, or just at the end of each horizontal video line.
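To make the "24 mod 16 = 8" point concrete, here is a small software model of packing pixels back to back into fixed-width words, padding only at the end. It is a sketch, not FPGA logic: `pack_pixels` and its parameters are invented for illustration, and in hardware this would be a shift register or gearbox rather than a loop.

```python
def pack_pixels(pixels, pixel_bits=24, word_bits=16):
    """Pack pixel values back to back into words, zero-padding the final word."""
    acc = 0        # bit accumulator, oldest bits in the most significant positions
    acc_bits = 0   # number of valid bits currently held in acc
    words = []
    word_mask = (1 << word_bits) - 1
    pixel_mask = (1 << pixel_bits) - 1
    for p in pixels:
        acc = (acc << pixel_bits) | (p & pixel_mask)
        acc_bits += pixel_bits
        while acc_bits >= word_bits:
            acc_bits -= word_bits
            words.append((acc >> acc_bits) & word_mask)
        acc &= (1 << acc_bits) - 1   # keep only the bits not yet emitted
    if acc_bits:
        # Leftover bits: pad the tail with zeros to fill one last word.
        words.append((acc << (word_bits - acc_bits)) & word_mask)
    return words

# Two 24-bit pixels fit exactly into three 16-bit words.
print([hex(w) for w in pack_pixels([0x112233, 0x445566])])
# A single pixel leaves 8 bits of padding in the second word.
print([hex(w) for w in pack_pixels([0x112233])])
```

With an 8 bit word length the same pixel maps onto three whole words and no padding is needed, which is the difference being described.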

Quote

If two 8 bit numbers are stored at one 16 bit memory address, and the first 8 bit pixel value is changed while the second remains unchanged, does the whole 16 bit number have to be rewritten, making it less efficient than changing one pixel at an 8 bit address? Simply put, can the individual bits within a RAM address be changed, or must the whole word be rewritten?

This is really a bus issue and not a RAM issue. If you perform a transaction over a bus of width m, there must be m data signals driven onto the bus at once. The meaning assigned to the signals may be different for different transaction types. For instance, if you are writing a single octet and the bus is 64 bit, use control signals that say "ignore D[8:63]". So your question is, does your SDRAM support partial word operations. Usually this is a feature of video RAMs, but is less common elsewhere.
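A partial word write can be modelled as one enable bit per byte lane. The sketch below is only a behavioural model under that assumption: `masked_write` and `byte_enable` are hypothetical names, and on a real SDRAM the mask pins may have the opposite polarity (asserted meaning "ignore this byte"), so check the datasheet.

```python
def masked_write(old_word: int, new_word: int, byte_enable: int, width_bytes: int = 2) -> int:
    """Return the stored word after a write that only updates the enabled byte lanes."""
    result = old_word
    for lane in range(width_bytes):
        if byte_enable & (1 << lane):
            lane_mask = 0xFF << (8 * lane)
            # Replace just this byte lane, leave the others as they were.
            result = (result & ~lane_mask) | (new_word & lane_mask)
    return result

# Update only the low byte of a 16-bit word; the high byte is untouched.
print(hex(masked_write(0xAABB, 0x1122, 0b01)))  # 0xaa22
```

The full 16 bits are still driven on the bus for every write; the mask only controls which bytes the memory actually stores.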

electosleepy · « **Reply #3 on:** January 07, 2016, 09:35:06 am »

Quote

So x32 it is

Why do you suggest a 32 bit word length? Are larger/faster RAMs generally only available with larger word lengths?

Quote

These are the number of data signals brought out to the bus, but the RAM is accessed internally in larger parcels called lines. The access time will be best when a line is opened, then you read and/or write words from that line, then close it and open another line.

Is a line equivalent to a bank? Just to clarify terminology.
www.alliancememory.com/pdf/dram/256M-AS4C16M16SA.pdf

Quote

It is not equivalent because 24 mod 8 = 0, but 24 mod 16 = 8. Whether that makes any difference depends on how you design the logic; it is not an insurmountable problem. You may need padding at the end of a pixel, or just at the end of each horizontal video line.

I can understand how it is not equivalent as you have 8 bits that are unused in the second 16 bit address if you are storing 24 bits. So for example if I stored a 640 x 480 frame with 24 bits per pixel in the RAM would the logical operation of combining two 8 bit pixels into a 16 bit number result in a significant processing overhead? I understand that it is a small operation but it has to be completed 230 400 times per 640 x 480 frame (((640 x 480 x 24) / 16) / 2). Is padding automatic and would it be quicker to store an 8 bit number into the 16 bit address even though this would waste memory?

Would it be best to choose a 16 bit word length to accommodate the pixel depth, as HDMI can support a pixel depth of 8, 10, 12 and 16 bits? I do not think that my application will always be limited to an 8 bit pixel value.

hamster_nz · « **Reply #4 on:** January 07, 2016, 09:52:03 am »

Quote from: electosleepy on January 07, 2016, 09:35:06 am

Quote
So x32 it is
Why do you suggest a 32 bit word length? Are larger/faster RAMs generally only available with larger word lengths?

An x32 chip will have four times the bandwidth of an x8 chip, for the same clock speed.
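As a quick sanity check of the "four times" figure, peak bandwidth is just bus width times transfers per second. The helper name and the 133 MHz clock below are illustrative assumptions, not figures from a specific part.

```python
def peak_bandwidth_mb_s(width_bits: int, clock_mhz: float, transfers_per_clock: int = 1) -> float:
    """Theoretical peak bandwidth in MB/s, ignoring refresh and row open/close overhead."""
    return width_bits * clock_mhz * transfers_per_clock / 8

print(peak_bandwidth_mb_s(8, 133))    # 133.0 for an x8 part
print(peak_bandwidth_mb_s(32, 133))   # 532.0 for an x32 part at the same clock
```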

Quote from: electosleepy on January 07, 2016, 09:35:06 am

Is a line equivalent to a bank? Just to clarify terminology.

No. An SDRAM chip might have four banks, and each bank has its own rows (lines) and columns.

Quote

Would it be best to choose a 16 bit word length to accommodate the pixel depth, as HDMI can support a pixel depth of 8, 10, 12 and 16 bits? I do not think that my application will always be limited to an 8 bit pixel value.

No. Take your max pixel data bandwidth (approx 450MB/s for 1080p, 24bpp), double it, and add a little bit more for overhead, and that is the minimum bandwidth needed to buffer the video stream, e.g. 1GB/s or a little bit more. If you want to process that video in memory (rather than on the way in or on the way out) you will need to double that again. If you want to do 48 bits per pixel rather than 24, then you will need twice that again. If you want to overlay from SDRAM you will need another chunk of bandwidth.
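The budget above is easy to tabulate. This sketch assumes the standard 148.5 MHz pixel clock for 1080p60 (not stated in the thread, but it is where the "approx 450MB/s" comes from) and leaves out the unquantified "little bit more" for overhead; the function names are made up for illustration.

```python
def stream_mb_s(pixel_clock_mhz: float, bits_per_pixel: int) -> float:
    """Raw pixel data rate of one video stream in MB/s."""
    return pixel_clock_mhz * bits_per_pixel / 8

def buffer_budget_mb_s(stream: float, process_in_memory: bool = False) -> float:
    """Minimum memory bandwidth: one write plus one read of the stream,
    doubled again if the frame is also processed in memory."""
    budget = 2 * stream
    if process_in_memory:
        budget *= 2
    return budget

s24 = stream_mb_s(148.5, 24)
print(s24)                                               # 445.5
print(buffer_budget_mb_s(s24))                           # 891.0, before overhead
print(buffer_budget_mb_s(s24, process_in_memory=True))   # 1782.0
print(buffer_budget_mb_s(stream_mb_s(148.5, 48)))        # 1782.0 for 48 bpp
```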

Best bet - use 30 bit pixels internally, and pad or truncate incoming data to fit, and then use a x32 SDRAM chip, or two x16 parts. See how the numbers work out.

If that doesn't work out, use two different memory controllers, and ping-pong frames between them (write to one while reading from the other). That halves the bandwidth each memory has to supply.

electosleepy · « **Reply #5 on:** January 08, 2016, 07:49:18 am »

I've been looking into DDR1 and I can see that it doubles the bandwidth when compared to SDRAM as it allows data to be transferred on the rising and falling edges of the clock cycle. I am having difficulty understanding how DDR2 is then able to double the bandwidth of DDR1 especially as to what the clock multiplier is in bandwidth calculations. Does it use two clock signals that are 180 degrees out of phase to double the data rate of DDR1?

https://en.wikipedia.org/wiki/Double_data_rate#/media/File:SDR_DDR_QDR.svg

Rasz · « **Reply #6 on:** January 08, 2016, 10:20:40 am »

Quote from: electosleepy on January 08, 2016, 07:49:18 am

I've been looking into DDR1 and I can see that it doubles the bandwidth when compared to SDRAM as it allows data to be transferred on the rising and falling edges of the clock cycle. I am having difficulty understanding how DDR2 is then able to double the bandwidth of DDR1 especially as to what the clock multiplier is in bandwidth calculations. Does it use two clock signals that are 180 degrees out of phase to double the data rate of DDR1?

https://en.wikipedia.org/wiki/Double_data_rate#/media/File:SDR_DDR_QDR.svg

It's quadrupled in reference to the real memory clock, not the bus clock. Bus clock is always half, second doubling is inside the RAM chip.
All of this is done to hide/mask the fact that RAM speed hasn't kept up over the last ~10 years and stagnates somewhere around 200 MHz max.
Research latency, it's pretty much the same between DDR2, DDR3 and DDR4, all a big scam.

Scrts · « **Reply #7 on:** January 08, 2016, 07:22:26 pm »

Just for reference: I would suggest avoiding DDR1 and jumping to DDR2, because it has on-chip termination on the data bus. I remember adding a handful of resistors when doing a DDR1 design...

marshallh · « **Reply #8 on:** January 08, 2016, 08:17:35 pm »

Quote from: Scrts on January 08, 2016, 07:22:26 pm

Just for reference: I would suggest avoiding DDR1 and jumping to DDR2, because it has on-chip termination on the data bus. I remember adding a handful of resistors when doing a DDR1 design...

Not necessary when using devices directly point to point at the slow speeds of an FPGA; even Micron recommends this. But DDR2 is worth using just for availability/cost. It is a bit more annoying to deal with though.

electosleepy · « **Reply #9 on:** January 11, 2016, 04:44:55 am »

I would just like to say thanks to everyone who has contributed to this post so far; it's been really helpful. I've noticed that DDR is available with bus clock speeds of 100MHz, 133MHz, 166MHz and 200MHz, while DDR2 is available at 200MHz, 266MHz, 333MHz and 400MHz. So now it seems obvious why it is called DDR2: the bus clock speeds have doubled, hence doubling the maximum theoretical bandwidth. I just had to see it in a table http://www.nxp.com/pages/ddr-memories-comparison-and-overview:784_LPBB_DDR.

So for everyone on this post ignore my previous post as it has nothing to do with QDR.

Quote

I've been looking into DDR1 and I can see that it doubles the bandwidth when compared to SDRAM as it allows data to be transferred on the rising and falling edges of the clock cycle. I am having difficulty understanding how DDR2 is then able to double the bandwidth of DDR1 especially as to what the clock multiplier is in bandwidth calculations. Does it use two clock signals that are 180 degrees out of phase to double the data rate of DDR1?

https://en.wikipedia.org/wiki/Double_data_rate#/media/File:SDR_DDR_QDR.svg

I was originally really confused by this equation from wiki: (memory clock rate) × 2 (for bus clock multiplier) × 2 (for dual rate) × 64 (number of bits transferred) / 8 (number of bits/byte). To clarify, the memory clock rate multiplied by the bus clock multiplier is equal to the frequency of the bus clock.
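Written out as code, the equation looks like this. It is just a sketch of the formula quoted above: `ddr_peak_mb_s` is a made-up name, the multiplier is 1 for DDR1 and 2 for DDR2, and the 64 bit width is the DIMM width from the wiki formula, so a single x16 chip would use 16 instead.

```python
def ddr_peak_mb_s(memory_clock_mhz: float, bus_clock_multiplier: int, bus_width_bits: int = 64) -> float:
    """Peak transfer rate in MB/s from the internal memory clock."""
    bus_clock_mhz = memory_clock_mhz * bus_clock_multiplier   # I/O bus clock
    transfers_mt_s = bus_clock_mhz * 2                        # data on both clock edges
    return transfers_mt_s * bus_width_bits / 8

print(ddr_peak_mb_s(200, 1))                      # 3200.0 -> DDR-400 on a 64 bit bus
print(ddr_peak_mb_s(200, 2))                      # 6400.0 -> DDR2-800 on a 64 bit bus
print(ddr_peak_mb_s(200, 2, bus_width_bits=16))   # 1600.0 -> one x16 DDR2-800 chip
```

The same 200MHz memory clock gives twice the bandwidth on DDR2 because the bus clock is 400MHz rather than 200MHz.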

I've never implemented a memory interface before and would like to know: are there application-specific pins that I need to be aware of on the Artix 7, or can I use any GPIO pin? Is FIDELAYCTRL_REF the maximum switching frequency achievable by the I/O pins? See page 24

http://www.xilinx.com/support/documentation/data_sheets/ds181_Artix_7_Data_Sheet.pdf

hamster_nz · « **Reply #10 on:** January 11, 2016, 06:33:48 am »

If you use the Memory Interface Generator, which uses the hardened memory controller then your pins are pretty much decided for you. Have a good read of the Xilinx UG586 user guide (can't paste link while on phone sorry!).

You *really* do not want to be writing your own DDR2 controller....

Scrts · « **Reply #11 on:** January 11, 2016, 02:32:21 pm »

Quote from: hamster_nz on January 11, 2016, 06:33:48 am

If you use the Memory Interface Generator, which uses the hardened memory controller then your pins are pretty much decided for you. Have a good read of the Xilinx UG586 user guide (can't paste link while on phone sorry!).

You *really* do not want to be writing your own DDR2 controller....

Agreed. OP will mostly find DDR3 designs on development kits, but that's extremely difficult to route properly for a hobbyist. Even professionals have to do lots of simulation, including board level simulations for impedance discontinuities.

DDR2 is the way to go if you can solder BGA.

