Author Topic: DDR3 initialization sequence issue  (Read 62276 times)

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3136
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #100 on: June 08, 2021, 12:33:40 pm »
Besides the ODDR2 primitive, what other primitives do I need in this case?

If your design can run at the DDR3 clock speed, then ODDR, IDDR and IDELAY are all you need (not counting clock generation).

If not, you'll need OSERDES and ISERDES for IO.

 
The following users thanked this post: promach

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #101 on: June 09, 2021, 02:26:37 am »
For the IODELAY2 primitive, how do I generate a 90 degree phase shift on the incoming DQ data bits using an integer in the range 0 to 255 for the IDELAY_VALUE attribute?

 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3136
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #102 on: June 09, 2021, 03:09:04 am »
It is complicated. Xilinx has an app note which explains how to use SERDES and calibrate delays in Spartan-6: xapp1064.

 
The following users thanked this post: promach

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #103 on: June 09, 2021, 06:23:05 am »
In xapp1064 ,

1) Why is there a master and a slave?

2) Should I do DDR Data Reception using methods in Figure 5 or Figure 6 ?




 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3136
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #104 on: June 09, 2021, 01:51:58 pm »
ug381 explains these. Masters are associated with P pins. Slaves are associated with N pins. For differential signals they work together.

I don't think you can use DQS strobes to run PLL.

DQS may be able to work as a clock, but you would need to transfer the results to a continuous clock domain somehow.

Spartan-6 has MCBs - special hardware blocks for DDR memory. I don't know what they do and how they work. But because they exist, doing it on your own is not the mainstream approach. Therefore you need to be creative.
 

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #105 on: June 09, 2021, 03:07:54 pm »
It seems that https://www.xilinx.com/support/documentation/white_papers/wp249.pdf#page=5 explains it better.  Could you comment on the DPA state machine for handling dynamic skew/jitter?

As for the IODELAY2 primitive, I am confused about the delay calculation equation.

I did some calculations using TTAP8 and n=7, but could not get the result of 13.57 ns.

Besides, how should I make use of the bitslip function in the ISERDES primitive?



 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3136
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #106 on: June 09, 2021, 04:00:44 pm »
It seems that https://www.xilinx.com/support/documentation/white_papers/wp249.pdf#page=5 explains it better.  Could you comment on the DPA state machine for handling dynamic skew/jitter?

DQ lines are synchronized to DQS by design. They will take different paths within the FPGA, so you need to find an appropriate delay. Once you find the appropriate delay, they're very unlikely to desynchronize because they move in the same direction, and therefore temperature and voltage variations will affect them both roughly the same.

The relationship between DQS and your internal clock will depend on the round-trip delay, so it will vary. I noticed roughly 50-100 ps variations with a SODIMM as the chip gets warmer. But the synchronization between them doesn't need to be perfect.

IMHO, if you find good delay values once, you can use them indefinitely. So, you can do the calibration once during startup, or you can do it separately and remember the suitable delay values somewhere.
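
A minimal sketch of the "calibrate once" idea, in Python for readability rather than HDL. It assumes you have already swept the delay taps and recorded, per tap, whether a known test pattern read back correctly; the function name and the sweep data are hypothetical. It picks the middle of the widest passing window, which leaves the most margin on either side.

```python
def best_delay_tap(passed):
    """Return the centre tap of the widest run of passing taps, or None."""
    best_start, best_len = None, 0
    run_start = None
    # A trailing False acts as a sentinel so a run reaching the last tap is closed.
    for tap, ok in enumerate(list(passed) + [False]):
        if ok and run_start is None:
            run_start = tap
        elif not ok and run_start is not None:
            run_len = tap - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2


# Hypothetical sweep result: taps 5..13 read the test pattern back correctly.
sweep = [False] * 5 + [True] * 9 + [False] * 6
print(best_delay_tap(sweep))  # prints 9
```

The resulting number is what you would either apply at startup or hard-code into the design.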

As for the IODELAY2 primitive, I am confused about the delay calculation equation.

I did some calculations using TTAP8 and n=7, but could not get the result of 13.57 ns.

Looks correct to me:

424 ps / 8 * 256 taps = 13,568 ps ≈ 13.57 ns
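
The same arithmetic as a small Python sketch, plus a rough answer to the earlier 90 degree question. It treats every tap as TTAP8/8, which is an averaging assumption, and uses the 424 ps figure quoted in this thread; the 400 MHz clock is only an example value. Real tap delays vary with the device and conditions, so treat the tap count as a starting point for calibration and not as a final value.

```python
TTAP8_PS = 424.0  # figure used in this thread; check the datasheet for your part


def tap_delay_ps(ttap8_ps=TTAP8_PS):
    """Average delay of one tap, assuming 8 taps add up to TTAP8."""
    return ttap8_ps / 8


def total_delay_ns(taps, ttap8_ps=TTAP8_PS):
    """Total delay of `taps` taps, in nanoseconds."""
    return taps * tap_delay_ps(ttap8_ps) / 1000.0


def taps_for_quarter_period(clock_mhz, ttap8_ps=TTAP8_PS):
    """Nearest tap count for a 90 degree (quarter period) shift."""
    period_ps = 1e6 / clock_mhz
    return round((period_ps / 4) / tap_delay_ps(ttap8_ps))


print(round(total_delay_ns(256), 2))   # prints 13.57
print(taps_for_quarter_period(400))    # prints 12 (625 ps / 53 ps per tap)
```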

Besides, how should I make use of the bitslip function in the ISERDES primitive?

Bitslip is used when the data you receive is off by one bit (or several bits). I don't think you need it for DDR3 because you know precisely when the transmission starts, so you simply bring the ISERDES out of reset at the exact moment.
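
For what "off by one bit" means in practice, here is a toy Python model (not ISERDES code). It assumes a repeating training pattern is being received and counts how many one-bit slips are needed before the deserialized words line up with it. The function names and the pattern are made up for illustration.

```python
def deserialize(bits, width, slips=0):
    """Group a serial bit stream into words after skipping `slips` leading bits."""
    usable = bits[slips:]
    n_words = len(usable) // width
    return [tuple(usable[i * width:(i + 1) * width]) for i in range(n_words)]


def slips_needed(bits, width, pattern):
    """Smallest number of one-bit slips that makes every word equal `pattern`."""
    for slips in range(width):
        words = deserialize(bits, width, slips)
        if words and all(word == tuple(pattern) for word in words):
            return slips
    return None


pattern = [1, 1, 0, 0, 1, 0, 1, 0]
# The receiver started 3 bits early, so it first sees the tail of a previous word.
stream = pattern[-3:] + pattern * 3
print(slips_needed(stream, 8, pattern))  # prints 3
```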
 
The following users thanked this post: promach

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #107 on: June 09, 2021, 09:05:37 pm »
Bitslip is used when the data you receive is off by one bit (or several bits). I don't think you need it for DDR3 because you know precisely when the transmission starts, so you simply bring the ISERDES out of reset at the exact moment.

My DDR3 controller read-logic operates on a separate adjustable PLL phase, using the DQS coming from the DDR3s to re-align and determine the starting byte position based on the preamble and continuous 0101 pattern.  I have an allowable adjustable window region of +/- 1 CK.  The reason I need that range is the 'precise-start' is not guaranteed based on the length of tracks between DDR3 and FPGA, plus the power-up tuning of my reference RDQ_CK sampling PLL.

The story is different if you are using the DQS input as your actual capture clock.

Pros and cons: Using DQS as your read clock may more easily allow higher speed transfers above 400 MHz, but your FPGA will require dedicated DQS circuitry tied to dedicated DQ lanes.

Sampling the DQS in parallel with DQ, the method I am using, relegates me to slower speeds, or to having 1-4 DDR3 RAM chips on the PCB, or 1 SODIMM module. But I can now run DDR3 on Cyclone III/IV, which only have a single unbalanced DQS IO, since I can access the DDR3 DQS lines as regular DQ lines in software-simulated differential.  The newer Cyclone V, Cyclone 10 and MAX 10 have differential DQS lines designed for DDR3 & 4.  My solution works on these FPGAs as well, with no change in code...
« Last Edit: June 09, 2021, 09:22:50 pm by BrianHG »
 
The following users thanked this post: promach

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #108 on: June 10, 2021, 01:27:43 am »
Quote
I have an allowable adjustable window region of +/- 1 CK.  The reason I need that range is the 'precise-start' is not guaranteed based on the length of tracks between DDR3 and FPGA, plus the power-up tuning of my reference RDQ_CK sampling PLL.

The story is different if you are using the DQS input as your actual capture clock.

What do you mean by RDQ_CK sampling PLL ?  Are you not using DQS as your actual capture clock ?
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #109 on: June 10, 2021, 02:59:50 am »
Quote
I have an allowable adjustable window region of +/- 1 CK.  The reason I need that range is the 'precise-start' is not guaranteed based on the length of tracks between DDR3 and FPGA, plus the power-up tuning of my reference RDQ_CK sampling PLL.

The story is different if you are using the DQS input as your actual capture clock.

What do you mean by RDQ_CK sampling PLL ?  Are you not using DQS as your actual capture clock ?
No.  Since the read DQS comes in parallel with the DQ and my outgoing CK clock, I have 1 tunable PLL output tuned to the optimum read phase using the MPR System Read Calibration (see figure 59) during power-up.  With this setup, your chosen FPGA does not require dedicated DQS circuitry, just any DDR IO for the DQS pins, and dedicated DQS pins remain compatible.  (Even DQ groups may be ignored; however, I still recommend wiring them properly.)

The con is that I need 3 PLL outputs to run my system: 1 for CK and logic, 1 at 90 degree phase for generating the write DQ & DQM outputs, and 1 tunable PLL output for the read sampling.

So long as your FPGA has simple DDR or SERDES IOs that can handle at least 600 Mbps, my controller will work on hardware and simulate properly with any FPGA vendor's DDR IOBUF IP without resorting to special simulation bypass code.

The other con is that the lengths of your CK, DQS, DQ & DQM traces need to be matched.  This means 1 or 2 DDR3 ICs.  You can get away with 4 wired in 1 row (so long as the CK is routed to the middle of the RAM chips), or use a laptop SODIMM module with 4 on top and 4 underneath with the CK routed from the center, single or dual rank.  My controller can also output multiple CK pairs if you want support for more DDR3, or to place 2 DDR3 on one side of the FPGA and another 2 on the other side.

Without write leveling, I cannot guarantee operation with single or multiple PC memory modules with 8 or 16 RAM chips.
 
The following users thanked this post: promach

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #110 on: June 10, 2021, 03:43:45 am »
Since you have done MPR calibration during initial power-up, may I know why you would need "1 tunable PLL output for the read sampling"?

May I also know how you phase shift your incoming DQ data bits so that they are sampled in the middle by the incoming DQS strobe?
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #111 on: June 10, 2021, 04:11:05 am »
Since you have done MPR calibration during initial power-up, may I know why you would need "1 tunable PLL output for the read sampling"?

May I also know how you phase shift your incoming DQ data bits so that they are sampled in the middle by the incoming DQS strobe?
I am not using the DQS strobe as a clock for sampling the DQ.  I'm using the DQS inputs as a 'data_enable' DDR input, where the 'preamble' is used as a sync/reset for the read buffer position.  Remember, when reading data, the DQS is in perfect sync with the read DQ.
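
As a rough illustration only (a Python toy, not the controller's HDL, and the exact sample values are assumptions): if DQS is captured like any other DDR input, two samples per CK, a read burst shows up as a low preamble followed by a 1010... toggle. Searching for that marker gives the index of the first valid DQ sample. The names below are hypothetical.

```python
PREAMBLE = [0, 0]                  # assumed: DQS held low for one CK before the burst
BURST = [1, 0, 1, 0, 1, 0, 1, 0]   # assumed: DQS toggling across a burst of 8


def find_burst_start(dqs_samples):
    """Index of the first burst sample, or None if no preamble+burst marker is seen."""
    marker = PREAMBLE + BURST
    for i in range(len(dqs_samples) - len(marker) + 1):
        if dqs_samples[i:i + len(marker)] == marker:
            return i + len(PREAMBLE)
    return None


# Made-up capture: idle samples, preamble, burst, then idle again.
dqs = [1, 1, 1] + PREAMBLE + BURST + [1, 1]
dq = list(range(len(dqs)))  # stand-in DQ samples, numbered by capture position
start = find_burst_start(dqs)
print(start, dq[start:start + len(BURST)])  # prints 5 [5, 6, 7, 8, 9, 10, 11, 12]
```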

The tunable PLL output clock goes to the 'input clock' for the DQ & DQS DDR input buffers and subsequent read data FIFO's input clock.

The PLL has the following optional inputs:

Phase_select,         (Selects which one of the many outputs of a single PLL you wish to adjust the phase of.)
Phase_step_enable,    (Steps the selected PLL output's phase by 1/16th to 1/64th of the PLL's reference clock output, i.e. 64 steps will shift the selected output by a perfect 360 degrees.)
Phase_direction,      (Step left or right.)

I'm only using 1 PLL, it's just that I have 3 outputs enabled and I am adjusting the phase of clk #2, while I set the parameters so that clk #1 is at 90 degrees and clk #0 is my reference at 0 degrees, which is also the system clock.  (So far, the power-up default phase 0 has always been chosen.  I doubt it would move until you have the memory a few inches away from the FPGA, or you are going through a connector to a memory module.  Then, I only expect the phase to move by 1-2 steps to the right.)
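
To put numbers on the step size, here is a small Python sketch. It assumes the 64-steps-per-360-degrees case mentioned above and, purely as an example, a 400 MHz output clock; the function names are hypothetical and the real step size depends on the PLL configuration.

```python
def phase_after_steps(steps, steps_per_cycle=64):
    """Phase in degrees after `steps` phase steps, wrapping at 360."""
    return (steps % steps_per_cycle) * 360.0 / steps_per_cycle


def shift_ps(steps, clock_mhz, steps_per_cycle=64):
    """Time shift in picoseconds produced by `steps` phase steps."""
    period_ps = 1e6 / clock_mhz
    return steps * period_ps / steps_per_cycle


print(phase_after_steps(16))  # prints 90.0
print(shift_ps(2, 400))       # prints 78.125
```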

All Altera, Xilinx, & Lattice PLLs have the same feature: each of the PLL's multiple outputs can be step-adjusted individually in real time with just 3 control signals.

My DDR3 controller (coming soon) is fully vetted and fully functional on real hardware.  It's currently running on Arrow's $37 DECA board seen here: https://www.eevblog.com/forum/fpga/arrow-deca-max-10-board-for-$37/msg3453256/#msg3453256

For playing, it is well worth the $37 as it has so much on it, including a $150 MAX 10 FPGA with 512MB DDR3 RAM, and a shit load of peripherals like Ethernet and HDMI, with demo code for running each.
« Last Edit: June 10, 2021, 04:39:20 am by BrianHG »
 
The following users thanked this post: promach

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #112 on: June 10, 2021, 06:58:40 am »
Quote
The tunable PLL output clock goes to the 'input clock' for the DQ & DQS DDR input buffers and subsequent read data FIFO's input clock.

The PLL has the following optional inputs:

Phase_select,         (Selects which one of the many outputs of a single PLL you wish to adjust the phase of.)
Phase_step_enable,    (Steps the selected PLL output's phase by 1/16th to 1/64th of the PLL's reference clock output, i.e. 64 steps will shift the selected output by a perfect 360 degrees.)
Phase_direction,      (Step left or right.)

The issue is that the Xilinx ISE clocking wizard (coregen) for the PLL does not appear to have the phase stepping capability quoted above.  Please correct me if I am wrong.
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #113 on: June 10, 2021, 07:15:17 am »
Ever hear of Google?

Anyways, see here:
https://www.google.com/search?client=firefox-b-e&q=Xilinx+ISE+PLL+clock+phase+stepping

Follow the links...
Get to the document XAPP888....
Go to table 18 as an example.
The clock wizard can probably generate a default set of settings for you, and you just address the controls you want to change dynamically.

The difference between Xilinx and Altera is for Xilinx, you provide an integer number for the divide from the main PLL oscillator frequency and an integer for the phase and duty cycle instead of a step & direction.  (Though, Altera also has full PLL reconfiguration which boils down to these same controls...)  Lattice also re-configures just like Xilinx, just different address locations for the settings and different integers.
 

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #114 on: June 10, 2021, 07:20:38 am »
XAPP888 removed support for Xilinx ISE in 2014.

I have also confirmed that the MMCM IP is not available inside ISE coregen.
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #115 on: June 10, 2021, 07:29:28 am »
What about the following inputs for the PLL core:

Code:
PSCLK    Input  Dynamic Phase Shift Clock: Clock for use in dynamic phase shifting.
PSEN     Input  Dynamic Phase Shift Enable: Starts a dynamic phase shift transaction.
PSINCDEC Input  Dynamic Phase Shift Increment/Decrement: When '1', increments the phase shift of the output clock; when '0', decrements the phase shift.
PSDONE   Output Dynamic Phase Shift Done: Completes a dynamic phase shift transaction.

(LOL, Dead identical to Altera other than the 'exact' name of each IO port...)

I don't think Xilinx would remove a fundamental feature of any PLL technology.
Read the latest user guides on their Spartan-6 / 7 PLL functions.
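
How those four ports might be driven, as a toy behavioral model in Python (not HDL and not a Xilinx model). It assumes PSEN is pulsed for a single PSCLK cycle and that PSDONE pulses a few cycles later; the 4-cycle latency, the class and the helper are all made up for illustration, so check the user guide for the real timing.

```python
class PhaseShifterModel:
    """Toy model: a PSEN pulse starts a shift, PSDONE pulses some cycles later."""

    def __init__(self, latency_cycles=4):  # latency is a hypothetical value
        self.latency = latency_cycles
        self.phase_steps = 0
        self._countdown = 0
        self._pending = 0

    def clock(self, psen, psincdec):
        """Advance one PSCLK cycle and return PSDONE for that cycle."""
        psdone = 0
        if self._countdown > 0:
            self._countdown -= 1
            if self._countdown == 0:
                self.phase_steps += self._pending
                psdone = 1
        elif psen:
            self._pending = 1 if psincdec else -1
            self._countdown = self.latency
        return psdone


def step_phase(model, increment, timeout=100):
    """Issue one phase step and wait for PSDONE; False if it never arrives."""
    model.clock(psen=1, psincdec=int(increment))  # one-cycle PSEN pulse
    for _ in range(timeout):
        if model.clock(psen=0, psincdec=0):
            return True
    return False


pll = PhaseShifterModel()
for inc in (True, True, True, False):
    step_phase(pll, inc)
print(pll.phase_steps)  # prints 2
```

Waiting for PSDONE before issuing the next step is the point of the sketch; a calibration loop would call something like `step_phase` once per tap it wants to try.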
« Last Edit: June 10, 2021, 07:31:17 am by BrianHG »
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #116 on: June 10, 2021, 07:34:25 am »
Did you try enabling 'Dynamic Phase Shift' in the clock generation tool?
It's on the FIRST PAGE OF THE SETUP WIZARD!

The IOs I mentioned are right there in your documentation...

Time to read...
« Last Edit: June 10, 2021, 07:38:22 am by BrianHG »
 

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #117 on: June 10, 2021, 07:38:04 am »
I enabled Dynamic Phase Shift Ports inside the ISE clocking wizard (coregen); however, some frequencies are now reported as unsupported, marked as XXX in the table:

 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #118 on: June 10, 2021, 07:41:52 am »
305 doesn't divide evenly into your source clock, that is unless you are feeding the PLL a 305MHz crystal.
Try 400MHz, or 350MHz, or 300MHz, or 500MHz.  (All easily compatible with a 50MHz source clock)

You need to read up on the limitations of the PLL.
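
To see why 305MHz is awkward from a 50MHz source, here is a small Python check. It assumes the output is f_in * M / D with integer M in 2..32 and D in 1..32; those ranges are an assumption for a Spartan-6 DCM frequency synthesizer, so check the clocking user guide for your part. It does not model VCO limits or any other wizard restriction.

```python
from fractions import Fraction


def find_ratio(f_in_mhz, f_out_mhz, m_range=range(2, 33), d_range=range(1, 33)):
    """Return the first (M, D) with f_out = f_in * M / D, or None if there is none."""
    target = Fraction(f_out_mhz) / Fraction(f_in_mhz)
    for d in d_range:
        m = target * d
        if m.denominator == 1 and int(m) in m_range:
            return int(m), d
    return None


for f_out in (305, 300, 350, 400, 500):
    print(f_out, find_ratio(50, f_out))
# 305 needs M/D = 61/10, and M = 61 is outside the assumed range, so it prints None.
# 300, 350, 400 and 500 print (6, 1), (7, 1), (8, 1) and (10, 1).
```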
« Last Edit: June 10, 2021, 07:46:38 am by BrianHG »
 

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #119 on: June 10, 2021, 07:59:22 am »
None of 100MHz, 200MHz, 300MHz, 400MHz or 500MHz is supported when Dynamic Phase Shift is enabled.

Note: I am using 50MHz source clock
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 7638
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #120 on: June 10, 2021, 08:00:30 am »
You will need to ask a Xilinx user why.
 

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #121 on: June 11, 2021, 02:06:25 am »
Someone told me that dynamic phase shift and clock multiplication (DFS) are mutually exclusive DCM options.

 

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #122 on: June 11, 2021, 08:43:51 am »
@NorthGuy

Quote
Bitslip is used when the data you receive is off by one bit (or several bits). I don't think you need it for DDR3 because you know precisely when the transmission starts, so you simply bring the ISERDES out of reset at the exact moment.

Do I really need a PLL if I already have ISERDES?
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3136
  • Country: ca
Re: DDR3 initialization sequence issue
« Reply #123 on: June 11, 2021, 03:01:21 pm »
@NorthGuy

Quote
Bitslip is used when the data you receive is off by one bit (or several bits). I don't think you need it for DDR3 because you know precisely when the transmission starts, so you simply bring the ISERDES out of reset at the exact moment.

Do I really need a PLL if I already have ISERDES?

ISERDES will need a clock. You can either bring an external clock from DQS and route it through BUFIO, or generate it with a PLL/DCM and route it through internal clock buffers.

You will need to adjust the clock phase somehow.

For reading, you can use DQS to sample DQs (the canonical way), or you can use a clock generated by DCM (BrianHG's way). Either way, you need to produce a phase shift.

The phase shift may be accomplished by IODELAY or by DCM.

For DQS, you can apply IODELAY to the DQS input. Obviously, you cannot use DCM to phase shift DQS.

For BrianHG's method, you certainly can use the DCM phase shift mechanism. Routing the clock through IODELAY is another possibility. I don't know if you can route the internal clock through IODELAY with Spartan-6. I know that you can do this with 7-series.

You do not need to calibrate dynamically. You can calibrate once, then hard-code the phase shift/delay into your design. This will work for one board only; moving to a different board will require re-calibration. But you can have a separate design for calibration only. Such a design will produce the numbers which you plug into your main design.
 
The following users thanked this post: promach

Offline promach (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 875
  • Country: us
Re: DDR3 initialization sequence issue
« Reply #124 on: June 11, 2021, 03:14:04 pm »
For pages 5 and 17 of https://www.xilinx.com/support/documentation/application_notes/xapp1064.pdf , I have a few questions.

1) In https://github.com/mithro/soft-utmi/blob/master/hdl/third_party/XAPP1064-serdes-macros/Verilog_Source/Macros/serdes_1_to_n_data_ddr_s8_diff.v#L216-L256 and Figure 6, how does this phase detection state machine work, and which portion of the rest of the code belongs to the calibration state machine?

2) For Figure 18, what do "USE_DOUBLER=TRUE" and "I_INVERT=TRUE" mean?  What is the purpose of the dotted line labelled "Serdes Strobe"?

3) For Figure 6, how does the signal "User BITSLIP" work for data reception?  In the upper block, why is "Master IDELAY" connected to the two-input "BUFIO2_2CLK"?

4) How do I turn on DDR mode for the ISERDES primitive, and how is this different from the IDDR primitive?



 

