Author Topic: Why does this dds code fail (Solved)  (Read 6336 times)

Offline pcprogrammer (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Why does this dds code fail (Solved)
« on: March 17, 2022, 10:15:36 am »
I'm playing with Verilog, to which I'm new, and an Anlogic AL3-10 device, using the Tang Dynasty IDE release 5.0.3. I've run into strange behavior and I'm not sure whether the cause is my Verilog or the IDE.

I'm trying to make a DDS with two 14-bit DACs. As a first test I made a sawtooth generator. Running on a fixed counter increment it works for both channels, as long as there is a power-of-2 relation between the two step values. When I set step values that have a non-power-of-2 relation, it either kills the first channel or the frequency is incorrect.

The code I wrote for this:
Code: [Select]
//---------------------------------------------------------------------------
//Main module for connections with the outside world

module FA201_Lichee_nano
(
  //Input signals
  input wire i_xtal,       //50 MHz clock

  //Output signals
  output wire o_dac1_clk,
  output wire o_dac1_wrt,
  output wire o_dac2_clk,
  output wire o_dac2_wrt,

  output wire [13:0] o_dac1_d,
  output wire [13:0] o_dac2_d
);

  //---------------------------------------------------------------------------
  //Internal wire 
  wire core_clock;
 
  wire [31:0] channel1_signal_step;
  wire [31:0] channel2_signal_step;

  //---------------------------------------------------------------------------
  //Connection with the sub modules
 
  pll_clock pll 
  ( 
    .refclk   (i_xtal),
    .reset    (1'b0),
    .clk0_out (core_clock)
  );
 
  awg dac1
  (
     .i_main_clock           (core_clock),
     .i_signal_step          (32'h400000),
     .o_dac_clk              (o_dac1_clk),
     .o_dac_wrt              (o_dac1_wrt),
     .o_dac_d                (o_dac1_d)
  );

  awg dac2
  (
     .i_main_clock           (core_clock),
     .i_signal_step          (32'h1723549),
     .o_dac_clk              (o_dac2_clk),
     .o_dac_wrt              (o_dac2_wrt),
     .o_dac_d                (o_dac2_d)
  );

endmodule

//---------------------------------------------------------------------------

Code: [Select]
//----------------------------------------------------------------------------------
//Module for generating the DAC signals

module awg
(
  //Input
  input i_main_clock,
   
  input [31:0] i_signal_step,

  //Output
  output [13:0] o_dac_d,
 
  output o_dac_clk,
  output o_dac_wrt
);

  //--------------------------------------------------------------------------------
  //Registers

  reg clock;
  reg write;
 
  reg [35:0] signal_phase;

  //--------------------------------------------------------------------------------
  //Logic
 
  always@(posedge i_main_clock)
    begin
      clock <= ~clock;
    end
   
  always@(posedge i_main_clock)
    begin
      write <= ~write;
    end
 
  always@(negedge i_main_clock)
    begin 
      if(write == 1'b1)   
        signal_phase <= signal_phase + i_signal_step;         
      else
        signal_phase <= signal_phase;
    end
   
  //--------------------------------------------------------------------------------
  //Connect

  assign o_dac_d = signal_phase[35:22];
  assign o_dac_clk = clock;
  assign o_dac_wrt = write;

endmodule

//----------------------------------------------------------------------------------

When I set the same value for both step values it works, with both channels outputting the same frequency. As long as there is a power-of-2 relation between the two it works correctly. I just tried it with decimal 4194304 for channel 1 and decimal 12582912 for channel 2, a factor of 3 between the two. The result is 30.4 kHz on channel 1 and 22.8 kHz on channel 2. The 22.8 kHz is correct. The frequency on channel 1 should have been 7.6 kHz.

So why is this happening?
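For reference, the expected output frequency of a phase accumulator is f_out = f_update * step / 2^N. The simulation-only sketch below evaluates that for the two step values above. It assumes a 125 MHz update rate (the 250 MHz PLL clock mentioned later in the thread, with the accumulator enabled on every second cycle) and the 36-bit accumulator from the code. The module and function names are made up for illustration.

```verilog
// Sketch only: prints the expected DDS output frequency for a given step.
// Assumed: 125 MHz accumulator update rate and a 36-bit phase accumulator.
module dds_freq_calc;
  localparam real UPDATE_RATE_HZ = 125.0e6;        // assumed: 250 MHz / 2
  localparam real ACC_MODULUS    = 68719476736.0;  // 2^36

  // f_out = update_rate * step / 2^36
  function real dds_freq;
    input [31:0] step;
    begin
      dds_freq = UPDATE_RATE_HZ * step / ACC_MODULUS;
    end
  endfunction

  initial begin
    $display("step 4194304  -> %f Hz", dds_freq(32'd4194304));
    $display("step 12582912 -> %f Hz", dds_freq(32'd12582912));
    $finish;
  end
endmodule
```

Under these assumptions the two results come out at roughly 7.63 kHz and 22.89 kHz, which is in line with the 7.6 kHz and 22.8 kHz figures quoted above.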
« Last Edit: April 07, 2022, 09:25:03 am by pcprogrammer »
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8089
  • Country: ca
Re: Why does this dds code fail
« Reply #1 on: March 17, 2022, 11:47:50 am »
Did you check your compilation timing report to see if your FPGA can achieve the required FMAX clock rate? Using different clock adder values may lead the compiler to remove significant bits in your:
Code: [Select]
signal_phase <= signal_phase + i_signal_step;
logic, hence allowing a higher FMAX. Maybe some odd values for 'i_signal_step' require all 36 bits in your adder and your FPGA cannot achieve the required FMAX, messing up the output frequency.

Remember, always adding a fixed 65536 means the compiler will ignore the first 16 bits in your:
Code: [Select]
signal_phase <= signal_phase + i_signal_step;
code, which will now become only a 20-bit adder. A 20-bit adder will have a much higher FMAX than a full 36-bit adder. This is done because the bottom 16 bits will never change and it is easier to drop them from the generated gates in the calculation.

Extra but not necessarily relevant:
No power-up reset, or power-up default for the counter & output clock & write regs?

This usually doesn't affect FPGAs, especially with Quartus as they will power-up default to 0, but I do know it can affect simulations in Modelsim.
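One way to give the registers a defined power-up state is a declaration initializer, sketched below for the write toggle and the phase accumulator. This is a simplified variant of the awg module, not the original, and the module name is made up. Whether initial values are honored in hardware depends on the synthesis tool, so treat that as an assumption for Tang Dynasty. In simulation they keep the registers from starting as X.

```verilog
// Sketch: awg-style registers with explicit power-up values.
// Hardware support for initializers is tool dependent (assumption).
module awg_init_sketch
(
  input         i_main_clock,
  input  [31:0] i_signal_step,
  output [13:0] o_dac_d
);
  reg        write        = 1'b0;   // defined start value instead of X
  reg [35:0] signal_phase = 36'd0;  // accumulator starts at phase 0

  // Toggle the write enable on every clock
  always @(posedge i_main_clock)
    write <= ~write;

  // Advance the phase on every second clock
  always @(posedge i_main_clock)
    if (write)
      signal_phase <= signal_phase + i_signal_step;

  // The top 14 bits of the phase form the sawtooth output
  assign o_dac_d = signal_phase[35:22];
endmodule
```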
« Last Edit: March 17, 2022, 11:49:46 am by BrianHG »
 

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #2 on: March 17, 2022, 12:27:29 pm »
Thanks for the reply. I'm new to modern FPGAs. The last time I did something with FPGAs was some 25 years ago.

So no, I did not look at timing reports. I was not aware of them. I do understand that a 20-bit adder will be faster than a 36-bit adder and that the compiler can optimize. But a test with these two values:

step 1: 32'h1723549
step 2: 32'h17235490

gave correct output frequencies for both channels, with channel 2 16x higher than channel 1.

For step 1 the full 36 bits have to be there. That makes me believe the timing would be ok.

As I'm new to verilog I thought the code might be wrong somehow :o

I do see some warnings about timing constraints. Not sure if this is a problem, because I also get them when it works :-//
Code: [Select]
TMR-5009 WARNING: No clock constraint on 2 clock net(s):
core_clock
i_xtal_pad

You are right about there not being a reset in the code. I don't think it to be a problem for this testing. In the final system I plan to put some form of reset into it.

Edit: I'm also trying to do simulations in ModelSim, but have problems in getting things working. It fails on the PLL in the design. Some library issue |O
« Last Edit: March 17, 2022, 12:29:18 pm by pcprogrammer »
 

Online BrianHG

Re: Why does this dds code fail
« Reply #3 on: March 17, 2022, 12:58:23 pm »
Quote from: pcprogrammer on March 17, 2022, 12:27:29 pm

I do see some warnings about timing constraints. Not sure if this is a problem, because I also get them when it works :-//
Code: [Select]
TMR-5009 WARNING: No clock constraint on 2 clock net(s):
core_clock
i_xtal_pad


Yes. Without a timing constraint the compiler has no target, so each time you compile, any part of the generated design may end up able to run at anything from a slow to a fast frequency. In Altera/Intel's Quartus, we would need to create an .SDC ('Synopsys Design Constraints') file to tell the compiler the clock rates at which the CLK inputs are running, plus the IO timing for all the IOs. The most important are the source clocks, so that at least the FPGA internals will function correctly with your source clock, unless your design is really slow. The compiler's timing report will then tell you whether all the timing requirements were met.

If you are not ready to touch FPGA timing, then just set your PLL output to 1/4 speed. Then recompile and run your design and see if the outputs are correct, just working at 1/4 speed.

2 minor tricks to improve FMAX with your existing code: (Level A) change-
Code: [Select]
  always@(negedge i_main_clock)
    begin
      if(write == 1'b1)                // You might want to change this to 1'b0 to shift the output result.
        signal_phase <= signal_phase + i_signal_step;         
      else
        signal_phase <= signal_phase;
    end

to:
Code: [Select]
  always@(posedge i_main_clock)  // Always use 1 clock polarity everywhere to achieve the best FMAX
    begin
      if(write == 1'b1)   
        signal_phase <= signal_phase + i_signal_step;         
     /* else
        signal_phase <= signal_phase;    This may slow down FMAX with some compilers and it is not required in your code */
    end

 (Level B) change-
Code: [Select]
  //Connect

  assign o_dac_d = signal_phase[35:22];
  assign o_dac_clk = clock;
  assign o_dac_wrt = write;
to:
Code: [Select]
module awg
(
  //Input
  input i_main_clock,
   
  input [31:0] i_signal_step,

  //Output
  output reg [13:0] o_dac_d,  // This is now a register
 
  output reg o_dac_clk,  // This is now a register
  output reg o_dac_wrt  // This is now a register
);

.....
  //Connect
always@(posedge i_main_clock)  // Always use 1 clock polarity and separate the IO pins
    begin                      // from the faster core FPGA fabric logic by adding a 1 clock D-Reg latch delay
  o_dac_d <= signal_phase[35:22];
  o_dac_clk <= clock;
  o_dac_wrt <= write;
end

These additions are not a huge improvement unless you have a lot of other code in the FPGA, but the 2 tactics above will really help with marginal designs or weird pin assignments that cross more than 1 IO bank.

There are ways to really improve FMAX, but such tactics are for achieving things like 300MHz 36bit adders with slower FPGAs.



As for Modelsim, I can only help you with Altera's version with its included libraries, as I have plenty of experience and examples with them.
« Last Edit: March 17, 2022, 01:14:46 pm by BrianHG »
 
The following users thanked this post: pcprogrammer

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #4 on: March 17, 2022, 01:40:07 pm »
The timing constraints are a bit of a mystery to me. There is a timing wizard in the IDE but it is not clear what needs to be set :palm:

I made this with it:
Code: [Select]
create_clock -name core_clock -period 4 -waveform {0 2}
create_clock -name i_xtal -period 20 -waveform {0 10}
create_clock -name i_xtal_pad -period 20 -waveform {0 10}
set_clock_latency  -source 1 [get_clocks {i_xtal}]

But it still complains about the two clocks not having constraints. Even when I add set_clock_latency lines for them.

I see your point in using the same edge of the clock for better performance, but there is some benefit of the phase difference in the actions on the data. The DAC takes in the data on the rising edge of the write signal, which is synchronized to the rising edge of the main clock. Having both the write and the signal phase change on the rising edge of the clock might cause timing issues since the write signal is used in the decision to update the signal phase. Using the phase difference gives 2ns of room for the signals to be stable. (PLL is on 250MHz)

Your level B change is interesting. Does this translate in the FPGA to use the registers in the IO blocks, or will it use additional registers in the logic blocks? In case of the IO blocks your comment about separating the IO from the faster core logic makes sense.

I have to experiment with all of this. But that is what learning is about :)

Well, you are right that it is a timing issue. I removed the PLL and just clocked things on the 50MHz crystal clock, and it does what it is supposed to do with the factor 3 ratio between the two frequencies, as well as with the other fractional relation. The output is now at 1/5th of the frequency, but it works.
« Last Edit: March 17, 2022, 01:59:26 pm by pcprogrammer »
 

Online BrianHG

Re: Why does this dds code fail
« Reply #5 on: March 17, 2022, 01:51:43 pm »
Quote from: pcprogrammer on March 17, 2022, 01:40:07 pm
I see your point in using the same edge of the clock for better performance, but there is some benefit of the phase difference in the actions on the data. The DAC takes in the data on the rising edge of the write signal, which is synchronized to the rising edge of the main clock. Having both the write and the signal phase change on the rising edge of the clock might cause timing issues since the write signal is used in the decision to update the signal phase. Using the phase difference gives 2ns of room for the signals to be stable. (PLL is on 250MHz)
There is nothing but loss in using the 2 different clock edges the way you have done.

I knew it. 250MHz is a tall order for your 36-bit full adder. You need a really fast FPGA to do this, something like a $300 to $500 FPGA, especially as having your add enable on a different phase means your FPGA needs to operate as if it had a 500MHz 36-bit adder.

This is not the way to clock your DAC or design. You should be running the entire design at only 125MHz and feeding just the DAC clk outputs from a 250MHz clk, inverting it along the way to get the phase you desire. A 125MHz 36-bit full adder is much more likely to meet timing requirements on the majority of FPGAs out there.
« Last Edit: March 17, 2022, 01:56:25 pm by BrianHG »
 

Online BrianHG

Re: Why does this dds code fail
« Reply #6 on: March 17, 2022, 02:01:04 pm »
Quote from: pcprogrammer on March 17, 2022, 01:40:07 pm
Your level B change is interesting. Does this translate in the FPGA to use the registers in the IO blocks, or will it use additional registers in the logic blocks? In case of the IO blocks your comment about separating the IO from the faster core logic makes sense.

All this does is place dumb logic latches at the IO pin logic cells.
In your original code, depending on the compiler or its settings, the compiler may try to place the adder itself in the IO pin logic cells. This may save space on tiny PLDs/FPGAs, but it usually hurts FMAX, as those IO logic cells may not be able to route the mux/gates needed to perform the add where you need them if your IO pin assignments put the IO in non-optimum locations.
 
The following users thanked this post: pcprogrammer

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #7 on: March 17, 2022, 02:14:19 pm »
I see that this high-speed electronics world is a whole different playing field. Sure, signal delays, clock skew and what have you were also in play 25 years ago, but at 10 or 20MHz the problems were not that big. 6502 or Z80 CPUs running at 2 or 4MHz with 100ns memory, no problems. Now connecting your scope probe can make the difference between working and not working. Lots to learn again.


Online BrianHG

Re: Why does this dds code fail
« Reply #8 on: March 17, 2022, 02:26:09 pm »
Quote from: pcprogrammer on March 17, 2022, 01:40:07 pm
The timing constraints is a bit of a mystery to me. There is a timing wizard in the IDE but it is not clear on what needs to be set :palm:

I made this with it:
Code: [Select]
create_clock -name core_clock -period 4 -waveform {0 2}
create_clock -name i_xtal -period 20 -waveform {0 10}
create_clock -name i_xtal_pad -period 20 -waveform {0 10}
set_clock_latency  -source 1 [get_clocks {i_xtal}]

Is there a setting/field in the compiler setup to point to your 'source' .sdc file containing the above code, so that it knows it is to be used? Note that some compilers may also support or require an 'include' in the source Verilog for some SDC files. Altera Quartus has a field in the compiler settings menu where you list the .sdc files before they are recognized and used.
« Last Edit: March 17, 2022, 02:29:11 pm by BrianHG »
 

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #9 on: March 17, 2022, 02:36:35 pm »
Yes the IDE has a separate constraints section. It also holds the IO constraints.

When I remove the .sdc file from the project I get this warning:
Code: [Select]
TMR-5001 WARNING: No sdc constraints found while initiating timer.

Online BrianHG

Re: Why does this dds code fail
« Reply #10 on: March 17, 2022, 02:48:46 pm »
A snip from one of my projects:
Code: [Select]
#**************************************************************
# Create Clock
#**************************************************************
create_clock -period "10.0 MHz" [get_ports ADC_CLK_10]
create_clock -period "50.0 MHz" [get_ports MAX10_CLK1_50]
create_clock -period "50.0 MHz" [get_ports MAX10_CLK2_50]

create_clock -period "1.0 MHz"  [get_nets {I2C_HDMI_Config:u_I2C_HDMI_Config|mI2C_CTRL_CLK}]

The difference between 'get_ports' and 'get_nets' is that 'get_ports' must point to a net with a matching name on an IO pin, while 'get_nets' is for a net somewhere inside my design, here one where a clock is generated through logic.

As for the "50.0 MHz" after the -period: if I didn't include the MHz, the value would default to nanoseconds.
Without the -waveform, the compiler assumes a 50/50 duty cycle.
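Applied to the design in this thread, the same pattern might look like the sketch below. It assumes that Tang Dynasty accepts the standard SDC target collections ('get_ports', 'get_nets'), which is not confirmed anywhere in this thread, and it takes the net name 'core_clock' from the TMR-5009 warning text. In standard SDC, a create_clock without a target object defines a virtual clock that is not attached to any pin or net, which may be why a tool keeps reporting unconstrained clock nets.

```tcl
# Sketch only: attach each clock definition to an object in the design.
# 50 MHz crystal input on the top-level port i_xtal (20 ns period).
create_clock -name i_xtal -period 20 -waveform {0 10} [get_ports {i_xtal}]

# PLL output net at 250 MHz (4 ns period); net name taken from the warning text.
create_clock -name core_clock -period 4 -waveform {0 2} [get_nets {core_clock}]
```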
 

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #11 on: March 17, 2022, 03:01:35 pm »
Thanks for your input. It is much appreciated.

The timing constraints are something I have to do some reading on.

I modified my code with your suggestions and, with a bit of thinking of my own, it is now working at the 125MHz the DAC can take.

Code: [Select]
//---------------------------------------------------------------------------
//Main module for connections with the outside world

module FA201_Lichee_nano
(
  //Input signals
  input wire i_xtal,       //50 MHz clock

  //Output signals
  output wire o_dac1_clk,
  output wire o_dac1_wrt,
  output wire o_dac2_clk,
  output wire o_dac2_wrt,

  output wire [13:0] o_dac1_d,
  output wire [13:0] o_dac2_d
);

  //---------------------------------------------------------------------------
  //Internal wire 
  wire core_clock; 
  wire dac_clock;
 
  //---------------------------------------------------------------------------
  //Connection with the sub modules
 
  pll_clock pll 
  ( 
    .refclk   (i_xtal),
    .reset    (1'b0),
    .clk0_out (core_clock),   
    .clk1_out (dac_clock)
  );
 
  awg dac1
  (
     .i_main_clock           (core_clock),
     .i_signal_step          (32'h1723549),
     .o_dac_d                (o_dac1_d)
  );

  awg dac2
  (
     .i_main_clock           (core_clock),
     .i_signal_step          (32'h17235490),
     .o_dac_d                (o_dac2_d)
  );
 
  //--------------------------------------------------------------------------- 
  //Connections to external world 
 
  assign o_dac1_clk = dac_clock;
  assign o_dac1_wrt = dac_clock;
  assign o_dac2_clk = dac_clock;
  assign o_dac2_wrt = dac_clock;

endmodule

//---------------------------------------------------------------------------

Code: [Select]
//----------------------------------------------------------------------------------
//Module for generating the DAC signals

module awg
(
  //Input
  input i_main_clock,
   
  input [31:0] i_signal_step,

  //Output
  output reg [13:0] o_dac_d
);

  //--------------------------------------------------------------------------------
  //Registers

  reg [35:0] signal_phase;

  //--------------------------------------------------------------------------------
  //Logic
 
  always@(posedge i_main_clock)
    begin 
      signal_phase <= signal_phase + i_signal_step;         
    end
   
  always@(posedge i_main_clock)
    begin 
      o_dac_d <= signal_phase[35:22];
    end

endmodule

//----------------------------------------------------------------------------------

I modified the PLL to run at 125MHz and provide two clock outputs with a 90 degree phase shift between them. The first clock is used for the signal phase increment and the other for the external DAC clock signals.

There are some spikes in the DAC output I have to examine, but at least the frequencies are correct.

Now I have to see if it will work with input from a MCU.
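Since simulating the full design fails on the PLL, the awg module can also be checked on its own by driving it from a testbench clock. The sketch below is one way to do that, not something from the original project. It copies the module above and adds an initial value to signal_phase so the simulation does not start from X. The testbench name, the reference model and the 1000-cycle run length are arbitrary choices.

```verilog
`timescale 1ns/1ps
// Copy of the awg module above; the '= 36'd0' initializer is an addition
// so the simulation starts from a known phase.
module awg
(
  input             i_main_clock,
  input      [31:0] i_signal_step,
  output reg [13:0] o_dac_d
);
  reg [35:0] signal_phase = 36'd0;

  always @(posedge i_main_clock)
    signal_phase <= signal_phase + i_signal_step;

  always @(posedge i_main_clock)
    o_dac_d <= signal_phase[35:22];
endmodule

// Hypothetical testbench: compares the DAC data against a reference accumulator.
module awg_tb;
  localparam [31:0] STEP = 32'h1723549;

  reg         clk    = 1'b0;
  reg  [35:0] model  = 36'd0;  // reference phase accumulator
  integer     errors = 0;
  integer     i;
  wire [13:0] dac;

  awg dut (.i_main_clock(clk), .i_signal_step(STEP), .o_dac_d(dac));

  always #4 clk = ~clk;        // 8 ns period = 125 MHz

  initial begin
    for (i = 0; i < 1000; i = i + 1) begin
      @(posedge clk);
      #1;                      // let the nonblocking updates settle
      // The output register holds the phase from before this clock edge
      if (dac !== model[35:22]) errors = errors + 1;
      model = model + STEP;
    end
    if (errors == 0) $display("PASS");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```

This only checks the accumulator logic. It says nothing about whether the design meets timing at 125MHz in the FPGA.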

Online BrianHG

Re: Why does this dds code fail
« Reply #12 on: March 17, 2022, 03:38:19 pm »
Quote from: pcprogrammer on March 17, 2022, 03:01:35 pm
Thanks for your input. It is much appreciated.
...
There are some spikes in the DAC output I have to examine, but at least the frequencies are correct.

Now I have to see if it will work with input from a MCU.

 :-+

For the DAC clock, you should have a second PLL output at 125MHz tied to the IO pins directly, and tune the phase of that second output to 90 degrees, or 45, or 0. Now you can directly tune the DAC clk output timing relative to the data output. Ok, you already did that...

Also, if the compiler supports this attribute keyword, use it:

Code: [Select]
  //Output
 (* useioff = 1 *) output reg [13:0] o_dac_d
This tells the compiler that the output reg should be forced onto the IO pin's registers.

Short of defining the input and output delays in the .sdc file, this is a quick way to get a fast, clean parallel output bus.
« Last Edit: March 17, 2022, 03:44:03 pm by BrianHG »
 

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #13 on: March 17, 2022, 04:13:52 pm »
Thanks again. Really helpful.

With the code I get the following warnings:
Code: [Select]
TMR-6022 WARNING: Net o_dac2_wrt_pad cannot map all the sources/sinks to wire.
PHY-5011 WARNING: x0y0_pll_clkc0: 2666: is dangling

Searched with Google for the dangling thing but could not find anything useful.

The code is working, so the clocks are driving the logic and IO. So I wonder what the "cannot map all the sources/sinks to wire" is about?

The spikes might be caused by the hardware. 125MHz on these wires (see photo) is maybe a bit much :-DD But hey, it is just an experimental hobby.
Measurements with a logic analyzer without the DAC module did show a lot of spikes on the signals (second picture). Signal D15 was also on channel 1, which shows a lot of noise on the signal.

Online BrianHG

Re: Why does this dds code fail
« Reply #14 on: March 17, 2022, 06:59:54 pm »
Quote from: pcprogrammer on March 17, 2022, 04:13:52 pm
Thanks again. Really helpful.

With the code I get the following warnings:
Code: [Select]
TMR-6022 WARNING: Net o_dac2_wrt_pad cannot map all the sources/sinks to wire.
PHY-5011 WARNING: x0y0_pll_clkc0: 2666: is dangling


You might not be allowed to tie the PLL output clock to too many IOs in this fashion.

Also, what is 'o_dac2_wrt' ?  Maybe this signal should stay high?

Another thing, it may be better to make your second PLL output run at 250 MHz and run this code to generate your DAC clocks:

Code: [Select]
module dac_clk_gen (  // module name is a placeholder; any name will do
input clk_125,  // feed in 125MHz system clock from PLL
input clk_250,  // feed in 250MHz system clock from PLL
(* useioff = 1 *)(*preserve*) output reg dac_1_clk, // force preserve a logic cell at the IO pin to feed this clock
(* useioff = 1 *)(*preserve*) output reg dac_2_clk  // without the preserve, the FPGA compiler may simplify by wiring 1 logic cell output to 2 IOs, generating 1 clock ahead of the other.
);
parameter inv_clk_phase = 1'b0;  // set to 1'b1 to invert the DAC clock outputs

reg clk_half = 0;
reg clk_full_buffer1 = 0 ;
reg clk_full_buffer2 = 0 ;
reg clk_full_out = 0 ;

always @(posedge clk_125) begin
clk_half <= !clk_half ; // Generate a 62.5MHz clock in phase with the 125MHz dac data.
end

always @(posedge clk_250) begin
clk_full_buffer1 <= clk_half ;
clk_full_buffer2 <= clk_full_buffer1 ;
clk_full_out     <= clk_full_buffer1 ^ clk_full_buffer2 ; // One 250MHz-cycle-wide pulse on every 'clk_half' toggle, i.e. a 125MHz clock.

dac_1_clk <=  inv_clk_phase ^ clk_full_out ; //  Copy the 'clk_full_out' to an IO buffer with a parameter option to invert it.
dac_2_clk <=  inv_clk_phase ^ clk_full_out ;
end

endmodule

This should make the IO's logic cell flipflop drive the 125MHz output clock instead of the FPGA fabric's global clock being tied to the IO pin through a fuse.


I'm confused, are the spikes you are complaining about the data bits shown on your logic analyzer?
Or is the DAC actually showing glitches on the output?

Have you tried configuring the current drive for the FPGA IOs?
Have you tried changing the second PLL clock output's phase?

Note that with my code, the 2 clock phases should work with 0 degree, you would just change the 'inv_clk_phase' parameter.
Modifying the 250MHz clk phase should just be a last resort.

(Yes, 125 MHz data through that bundle of wires will be messy.  I recommend at least separating out of the bundle the CLK wires since D0's maximum toggle rate is 62.5MHz which isn't as bad as the 125MHz signals which may bleed into all the other traces.)
« Last Edit: March 17, 2022, 07:22:03 pm by BrianHG »
 

Online BrianHG

Re: Why does this dds code fail
« Reply #15 on: March 17, 2022, 07:34:04 pm »
Quote from: pcprogrammer on March 17, 2022, 02:14:19 pm
I see that this high speed electronics world is a whole different playing field. Sure signal delays, clock skew and what you have was also in play 25 years ago, but on 10 or 20MHz the problems where not that big. 6502 or Z80 cpu's running on 2 or 4MHz with 100ns memory no problems. Now connecting your scope probe can make a difference between working and not working. Lots to learn again.
25 years?  Well, 21 years ago, Altera released their Apex II FPGA, the forerunner to their Cyclone I/II/III/IV FPGAs.  It has close to the same IO speeds and would have run your current 125MHz 2-channel DDS code at the same rate as today.  In fact, it could pull off a 250MHz version.

Yes, I did make a video sampling and playback card on an Apex II chip with a 108MHz video DAC.
« Last Edit: March 17, 2022, 07:42:27 pm by BrianHG »
 

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #16 on: March 17, 2022, 08:25:39 pm »
Quote from: BrianHG on March 17, 2022, 06:59:54 pm
Also, what is 'o_dac2_wrt' ?  Maybe this signal should stay high?

The DAC is an AD9767. It uses the WRT signal to latch the data into a buffer and the CLK signal to latch the data from the buffer into the DAC. According to the datasheet they can have the same phase. So it needs to be actively clocked.

I will try the setup for it with the 250MHz clock you provided.

Quote from: BrianHG on March 17, 2022, 06:59:54 pm
I'm confused, are the spikes you are complaining about the data bits shown on your logic analyzer?
Or is the dac actually showing glitched on the output?

The spikes I mentioned are in the actual DAC output. Not sure about the cause. I used my Hantek scope since it is smaller and starts faster than the Rigol, but it is not as good. To get to the bottom of it I probably have to get my Rigol or my Yokogawa out on the desk.

The spikes I saw on the logic analyzer are caused by the noise on the signals. They change when I move the threshold on the analyzer, and by the looks of it they were not in line with the actual write edge, so most likely they cause no bit errors at the DAC. It certainly has to do with the wires. I tried the code on an Altera Cyclone IV board with the logic analyzer probe hooked directly onto the header pins, so no long wires, and I had far fewer glitches. I noticed that they appeared when most of the lines changed level at once.

Quote from: BrianHG on March 17, 2022, 06:59:54 pm
Have you tried configuring the current drive for the FPGA IOs?
Have you tried changing the second PLL clock output's phase?

I did play a bit with pullups/pulldowns, slew rate and drive strength, but it did not seem to make a difference on the logic analyzer, and the first test with the 125MHz clock and ~7kHz output on the DAC (free-running 14-bit counter) showed a glitch-free DAC output.

Did not play with the second PLL clock phase yet.

It is very interesting to play with.

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #17 on: March 17, 2022, 08:34:21 pm »
Quote from: BrianHG on March 17, 2022, 07:34:04 pm
Quote from: pcprogrammer on March 17, 2022, 02:14:19 pm
I see that this high speed electronics world is a whole different playing field. Sure signal delays, clock skew and what you have was also in play 25 years ago, but on 10 or 20MHz the problems where not that big. 6502 or Z80 cpu's running on 2 or 4MHz with 100ns memory no problems. Now connecting your scope probe can make a difference between working and not working. Lots to learn again.
25 years?  Well, 21 years ago, Altera released their Apex II FPGA, the forerunner to they Cyclone I/II/III/IV FPGA.  It has close to the same IO speeds and would have run your current 125MHz 2 channel DDS code at the same rate as today.  In fact, it could pull off a 250MHz version.

Yes, I did make a video sampling and playback card on an Apex II chip with a 108MHz video DAC.

At least 25 years ago. I started in 1989 or 1990 with the smaller Xilinx XC3000 devices. Later on the XC4000 series also became available, but those were still expensive compared to the XC3000 series. The last thing I made, with a low-power version of the XC3042, was a smart card reader connected to and powered by an RS232 port on a PC. This was around 1997. After that I drifted off into software development.

Not bad that the forerunner of the Cyclone series was already capable of pulling something like this off.

Online BrianHG

Re: Why does this dds code fail
« Reply #18 on: March 17, 2022, 08:37:47 pm »
Ohhh.  I do not like the way that digital interface works with 4 clocks.  Well, I guess I'm in the new school where they squeeze everything onto a high-speed serial bus.  Or at least a single DDR bus, with 1 data bus for both DACs using 1 clock for everything.

 

Online BrianHG

Re: Why does this dds code fail
« Reply #19 on: March 17, 2022, 09:22:07 pm »
For DDR mode, see figure 66 in the data sheet.
Cut down on your wiring by half.
 

Online BrianHG

Re: Why does this dds code fail
« Reply #20 on: March 17, 2022, 10:13:23 pm »
Quote from: pcprogrammer on March 17, 2022, 08:34:21 pm
Not bad that the forerunner of the Cyclone series was already capable of pulling something like this off.
Actually it is sad that their entry-level FPGAs have only added DDR on the IO ports, better PLLs and double the available density.  Maybe more dedicated HW multipliers.

I know that they have faster series of FPGAs, but they are priced prohibitively and it's been 21 years.

Take a look at RAM and CPU speeds and densities since 2001.  Altera FPGAs haven't really advanced all that much unless you go to the higher-end Arria & Stratix and embedded ARM core devices, but above $1k per FPGA isn't what I would call an advancement.
 

Offline pcprogrammer (Topic starter)

Re: Why does this dds code fail
« Reply #21 on: March 18, 2022, 06:11:02 am »
Quote from: BrianHG on March 17, 2022, 08:37:47 pm
Ohhh.  I do not like the way that digital interface works with 4 clocks.  Well, I guess I'm in the new school where they squeeze everything onto a high speed serial bus.  Or at least a single DDR buss where 1 data buss for both DACs using 1 clock for everything.

Quote from: BrianHG on March 17, 2022, 09:22:07 pm
For DDR mode, see figure 66 in the data sheet.
Cut down on your wiring by half.

I bought the module on Aliexpress and it came with some examples for boards with Xilinx Spartan6 devices. I used them as a starting point for what I'm doing now. It was cheap when I bought it. Now it costs almost double what I paid for it |O

Sure, with high-speed serial buses things are simpler and even faster, but there are not a lot of cheap modules available with these kinds of devices on them.

The DDR mode is interesting. Something to investigate.

Edit: I looked at the datasheet for this DDR mode, which they call interleaved mode, and I think it lowers the max samples per second per DAC. The write clock is still bound to the 125MHz limit, and the data is only clocked on the rising edge. So it is not a true double data rate setup.

It might be needed for my setup to introduce a phase difference between the WRT and CLK signals. For the interleaved mode they state:
Code: [Select]
At 5 V it is permissible to drive IQWRT and IQCLK together as shown in Figure 65, but at 3.3 V the interleaved data transfer is not reliable.

There is also a note that when the rising edge of CLK comes after the rising edge of WRT, a minimum delay of 2ns is needed for it to be reliable. I'm running the IO and DAC on 3.3V.

High speed hardware is turning out to be much harder than software ;)
« Last Edit: March 18, 2022, 07:44:02 am by pcprogrammer »
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #22 on: March 18, 2022, 06:20:41 am »
Actually, it is sad that their entry-level FPGAs have only added DDR on the IO ports, better PLLs and double the available density.  Maybe more dedicated HW multipliers.

I know that they have faster series of FPGAs, but they are prohibitively priced, and it's been 21 years.

Take a look at RAM and CPU speeds and densities since 2001.  Altera FPGAs haven't really advanced all that much unless you go to the higher-end Arria & Stratix and embedded ARM core devices, but above $1k per FPGA isn't what I would call an advancement.

Guess it is the same for Xilinx. And on top of the high prices for their high-end devices, you have to pay for the software to program them. When I first used Xilinx FPGAs I was working for a subsidized foundation and got a good deal on XACT, but it was still expensive and only suited for the low-density devices (up to the XC3064).

At least now these companies provide the software for free for the lower range of devices, which makes it usable for a hobbyist like me to play with FPGAs.

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #23 on: March 21, 2022, 01:12:19 pm »
Another thing, it may be better to make your second PLL output run at 250 MHz and run this code to generate your DAC clocks:
Code: [Select]
module (
input clk_125,  // feed in 125MHz system clock from PLL
input clk_250,  // feed in 250MHz system clock from PLL
(* useioff = 1 *)(*preserve*) output reg dac_1_clk, // force preserve a logic cell at the IO pin to feed this clock
(* useioff = 1 *)(*preserve*) output reg dac_2_clk  // without the preserve, the FPGA compiler simplify by wiring 1 logic cell output to 2 IOs generating 1 clock ahead of the other.
)
parameter bit inv_clk_phase  = 0;

reg clk_half = 0;
reg clk_half_buffer = 0;
reg clk_full_buffer1 = 0 ;
reg clk_full_buffer2 = 0 ;
reg clk_full_out ;

always @(posedge clk_125) begin
clk_half <= !clk_half ; // Generate a 62.5MHz clock in phase with the 125MHz dac data.
end

always @(posedge clk_250) begin
clk_full_buffer1 <= clk_half ;
clk_full_buffer2 <= clk_full_buffer1 ;
clk_full_out     <= clk_full_buffer1 ^ reg clk_full_buffer2 ; // Generate a single 250MHz pulse once every 'clk_half' toggle.

dac_1_clk <=  inv_clk_phase ^ clk_full_out ; //  Copy the 'clk_full_out' to an IO buffer with a parameter option to invert it.
dac_2_clk <=  inv_clk_phase ^ clk_full_out ;
end

endmodule

This should make the IO's logic cell flipflop drive the 125MHz output clock instead of the FPGA fabric's global clock being tied to the IO pin through a fuse.

I have been playing with this supplied code in ModelSim and got it working after fixing a typo after the xor operator. (In "clk_full_out <= clk_full_buffer1 ^ reg clk_full_buffer2 ; " the "reg" before "clk_full_buffer2" should not be there.) It provides the correct phase shift for the clock signal.

What I wonder about is what happens if one wants a 45 degree phase shift with a method like this. Will it work with the PLL clock raised to 500MHz?

Also, is it better to have the PLL deliver the two clocks (250MHz and 125MHz), or to derive the 125MHz just like the 62.5MHz clock?
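
For reference, a minimal sketch of the module with that stray "reg" removed. The module name dac_clk_gen is made up here because the posted listing has none. The parameter is written as a plain Verilog parameter instead of "parameter bit", and the unused clk_half_buffer register is left out. Whether the (* useioff *) and (* preserve *) attributes have any effect depends on the tool.

```verilog
// Sketch only: module name and plain-Verilog parameter style are assumptions,
// the logic itself follows the posted listing with the typo fixed.
module dac_clk_gen
#(
  parameter inv_clk_phase = 1'b0    // set to 1 to invert both DAC clock outputs
)
(
  input  wire clk_125,              // 125MHz system clock from the PLL
  input  wire clk_250,              // 250MHz clock from the PLL
  (* useioff = 1 *) (* preserve *) output reg dac_1_clk,
  (* useioff = 1 *) (* preserve *) output reg dac_2_clk
);

  reg clk_half         = 1'b0;
  reg clk_full_buffer1 = 1'b0;
  reg clk_full_buffer2 = 1'b0;
  reg clk_full_out     = 1'b0;

  // Toggle bit in the 125MHz domain, changes once per DAC sample.
  always @(posedge clk_125)
    clk_half <= !clk_half;

  always @(posedge clk_250) begin
    clk_full_buffer1 <= clk_half;
    clk_full_buffer2 <= clk_full_buffer1;

    // High for one 250MHz cycle after every toggle of clk_half.
    clk_full_out <= clk_full_buffer1 ^ clk_full_buffer2;

    // Separate output registers per pin, with optional inversion.
    dac_1_clk <= inv_clk_phase ^ clk_full_out;
    dac_2_clk <= inv_clk_phase ^ clk_full_out;
  end

endmodule
```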

Online BrianHG

Re: Why does this dds code fail
« Reply #24 on: March 21, 2022, 05:26:58 pm »
System clock timing on the FPGA is designed around the PLL.  If you ever need performance and do not want to get deep into timing, always use the PLL clock outputs directly to clock logic on the FPGA.

The purpose of the 'clk_half <= !clk_half' is not to create a clock, but to create a data output bit which toggles with every new piece of data you want to transmit to the DAC.  If the DAC were external RAM, think of this wire as address bit 0.

The purpose of the 'always @(posedge clk_250) begin' block is that every time there is a transition on your DAC's 'address bit 0' net, called 'clk_half', we also capture the data to be sent to the DAC, and on the next clock we pulse the DAC clk pin for 1/2 a 125MHz cycle.  Now, when wiring an FPGA's PLL clock directly to an IO pin, unless that IO pin is the 'special' dedicated FPGA PLL CLK output pin, the compilation reports a warning that output jitter and timing are not guaranteed.  This is because the path from the internal global clock network, which the PLL outputs are wired to, to a generic IO pin (or several of them) has no internal wiring dedicated to that purpose.  The best generic IO pin performance comes when the Q data output of each IO pin's closest dedicated logic cell drives that IO pin directly.  Since your DAC uses multiple clocks in parallel, making this tiny 250MHz section drive all those IO pins as normal logic data outputs to synthesize multiple clocks in parallel will make all those parallel IOs as clean as possible.

So, in conclusion, if you want to shift your DAC's output clock by 180 degrees, use my parameter.  For smaller increments, change the PLL's output phase of the 250MHz section (0 to -90 deg, then use my inv parameter and again 0 to -90 deg).  This way, you let the compiler work out the timing between your 125MHz logic core and that tiny 250MHz IO pin driver.

The only reason to go to 500MHz would be if you want to make your core 250MHz, for which your 125MHz DAC isn't fast enough anyway.

As for tuning while the system is running, there does exist a set of PLL input controls which allow you to manually step each PLL output in something like 11.25 degree increments during operation, but that may be beyond your development needs at this time.  I would say for now, try combinations of the 'inv_clk_phase' parameter and adjusting the output phase of the 250MHz clock to -45 degrees and -90 degrees, as this should allow you to create a clean window for the DAC to sample the data.
« Last Edit: March 21, 2022, 05:33:11 pm by BrianHG »
 

Online BrianHG

Re: Why does this dds code fail
« Reply #25 on: March 21, 2022, 05:36:25 pm »
Note that depending on your PLL's settings, the phase may be in negative degrees, or negative nanosecond/picosecond settings.  I am just used to how Quartus' PLL configuration does it.
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #26 on: March 21, 2022, 05:50:10 pm »
From what I have seen of it, the PLL phase is set in positive degrees, and it is possible to set it in nanoseconds too. I have not tested your clock trick in the actual FPGA yet. Working on the MCU interface now. See my other post with a question about write enable.

Edit: With the three clocks out of the PLL and using inverters between the PLL clocks and the outputs I got rid of the warnings about the source/sink problem, but I can see with the scope that the signals are not very accurate phase-wise, so I do need to switch to your solution.

Managed to get ModelSim working in a standalone manner to test the separate modules first.

Attached is a screen cap of the Tang Dynasty IDE PLL IP generator window for setting the clocks.
« Last Edit: March 21, 2022, 05:53:34 pm by pcprogrammer »
 

Online BrianHG

Re: Why does this dds code fail
« Reply #27 on: March 21, 2022, 05:56:57 pm »
You should only need 2 clocks. #1 will be the 125MHz at 0 degrees (this is where everything in your design should be clocked from). #2 will be 250MHz, at an optional negative phase shift (this is reserved exclusively for synthesizing the DAC CLK pins).

You are using a negative phase shift since the data coming in from the 125MHz side needs to be clean and ready before the 250MHz clocked logic samples 'clk_half', which should be running in parallel with the global system 125MHz clock.  If you place a positive number here and the reg 'clk_half' only barely makes it in time, even though it may be super fast (like ready in 1ns), the 250MHz side may sample this signal randomly as a 1 or 0 because it grabs that net's data in the middle of a transition.

Note that in some systems, it may be advantageous to make the 250MHz first, then the 125MHz second.  However, since you will be tuning the 250MHz only, and it only has something like 8 logic cells in that section, I would place it second.
« Last Edit: March 21, 2022, 05:59:48 pm by BrianHG »
 

Online BrianHG

Re: Why does this dds code fail
« Reply #28 on: March 24, 2022, 05:00:19 pm »
Did you ever get the .sdc file to be properly recognized & used?
Does your compiler claim that the timings were met?
Did you get rid of the output dac spikes?

Note that to do so, you may need to add more lines to your .sdc file.  You will need to define your DAC's IO output setup and hold times.

I do know that in Quartus, once I pass ~200MHz, if I do not define the setup and hold, each output on a bus may have + or - skew compared to the clock output.
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #29 on: March 24, 2022, 05:57:27 pm »
Yes, the .sdc file works. With the MCU interface cleaned up, I had to remove some constraints since some signals were removed, and now all the warnings are gone. :-+

Still have the spikes in the output, but that might have to do with the DAC write and clock signals not being implemented like you showed yet. I did the MCU interface first, with your help.

The timing report states a max freq on the core_clock (125MHz signal) of ~220MHz, so there appears to be enough room for more logic.

Getting to know the IDE and everything that has to do with FPGAs more and more.

Cheers,
Peter

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #30 on: March 31, 2022, 10:38:52 am »
Working with higher frequencies is definitely a pain in the bum |O

There is so much interference that it is difficult to pinpoint problems when measuring the signals. With the 32 bit logic analyzer in my Yokogawa DL9705L I can hook up all the signals of the two DACs and then sync on the spikes with a trigger delay, but with the different latches involved it is hard to spot the cause. >:(

By experimenting with the phase of the two signals that load the data into a DAC I was able to get rid of the spikes.

The suggested solution did not do the trick, and I ended up just using two 250MHz outputs of the PLL clocking two dividers to make the signals.

I also found that the given directives (*preserve*)(* useioff = 1 *) do nothing in the Tang Dynasty compiler. I tried different formats like (* preserve = 1 *) or (* preserve = "yes" *), but it makes no difference. There is a "schematic" viewer in the IDE with which I can follow the signals, and it just refuses to put the registers in the IO blocks :o

Played a bit with the chipviewer that is available in the IDE. It is usable for "slower" signals. Logic is added to the bit stream to capture the signals you specify to be monitored. This extra logic is connected to a clock you assign to it. This means extra fan-out on this clock and thus lowers the maximum frequency. I tried a setup with an extra high speed clock from the PLL, but that needs to be connected to some "logical" logic, because otherwise it is optimized out and not available to connect to the chipviewer. Maybe disabling optimization can help there, but then it might screw up the intended logic.

For the DAC clock signals it did not work due to the clock phases and speed. In the viewer the four signals all had the same phase as the core clock, which I know is not the case based on the external measurements.

At least I learned new things and got rid of the spikes.

Online BrianHG

Re: Why does this dds code fail
« Reply #31 on: March 31, 2022, 05:17:11 pm »
Note that with some FPGAs, you might be able to use / instantiate an IO block.  In the parameters for said IO block, you can request features like DDR, IO cell on the IO pin, fine delay, drive strength, differential...

In Lattice/Quartus, we may use an IO_BUF.

Such IO buffers will contain a DFF clock, Enable, OE, and a few other options.
Also note that such a piece of code will lock you into your IDE's device, so place this code at the top hierarchy just before the IOs.

The other choice is to continue filling out the .sdc file.  Placing strict output delay and output hold times will force the compiler to use DFF blocks either close to or on the IO pin; if not, it will still have to clean up the timing according to your .sdc settings.  This is more standard and would be recognized within Quartus and Xilinx.  For some reason, Lattice has moved away from the .sdc standard and you need to fill in the same values elsewhere in their IDE.
 

Online BrianHG

Re: Why does this dds code fail
« Reply #32 on: March 31, 2022, 05:29:56 pm »
Note that if you use an IO buffer and use its DDR output function (I'm not talking about the DDR inside your DAC), you can change your 250MHz clock to 125MHz.  Get rid of my PSEUDO DDR 125MHz clock generator, and for all your data lines, tie them into both the IO buffer's Hi and Low inputs, while for the clock outputs, tie the output buffer's Hi to 1'b1 and the Low to 1'b0, with an option to invert.  Also, for those clock outs, use the separate 125MHz output so you may precisely tune its output phase.

Note that some IOs on your FPGA might not support DDR.  If those are wired to your DAC, they will need to be moved.  This will provide super clean timing, as DDR IO buffers can usually make it into the GHz.
« Last Edit: March 31, 2022, 05:49:18 pm by BrianHG »
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #33 on: March 31, 2022, 06:58:06 pm »
I have looked at what the IDE does for the DAC data pins, and there it is using the DFF within the IO block, so it is capable of doing it. This happens with or without the "useioff" directive.

These DAC data pins are on the 125MHz clock, while the other DAC signals are on the 250MHz clocks, so could it be that it can't route these clock signals to the IO blocks with the high speed clock routes, and therefore refuses to do it?

I will try with the timing constraints to see if I can force the DAC clock signals to do the same. The DFFs it is using now are in slices near the IO pins, so timing might not change that much. For the DAC to work properly at 125MHz, the two lines (wrt and clk) either have to rise at the same moment or be well out of phase. Measurements showed that the "same" on two different FPGA pins does not fly at these higher frequencies. I noticed shifts in the order of 600 - 1200ps, which the DAC does not like. Making the signals from two 90 degree shifted 250MHz clocks solved the problem.

Thinking about your DDR option on these pins. That way they would still deliver 125MHz while being clocked on 125MHz. Only shifting 90 degree in phase is not an option then?

I thought about using low-level macros to control the usage of the DFF in the IO block, but like you stated this kills portability. I noticed that the compiler uses these macros when prepping for simulation. It creates flat verilog files to load into the simulator.

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8089
  • Country: ca
Re: Why does this dds code fail
« Reply #34 on: March 31, 2022, 07:11:19 pm »
Thinking about your DDR option on these pins. That way they would still deliver 125MHz while being clocked on 125MHz. Only shifting 90 degree in phase is not an option then?

Inverting the Hi and Low data inputs on the DDR will allow 180 degree shift.

Adjusting the phase of the second 125MHz PLL output will allow a 90 degree shift, or 45, or many other multiples depending on the phase setting of that PLL output.  Since it is now 125MHz, this setting should have more precise sub-divisions.  The plus here is that those DDR output cells for your clocks on the phase-shifted 125MHz have fixed data inputs of 1'b1 and 1'b0, with no connection to your main 125MHz 0 degree data clock.  Hence, there is no cross clock domain communication, eliminating the metastability issues you may have had when going from the 125MHz clock domain to the 250MHz clock domain when using my 2 clock code.

The advantage of using the same DDR output buffers for the DAC data, those on the master 0 degree clock with the Hi and Low tied to the same data, is that all the DDR buffers should have very refined parallel performance, as they were designed to communicate exceedingly fast with external DDR2/3 RAM.  This holds so long as you have chosen IOs which all have DDR capability.
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #35 on: March 31, 2022, 07:32:11 pm »
Ahh, I see your point now.

The whole DDR thing is a bit new to me. I knew it meant clocking data on both the positive and the negative going edge, but did not think about how this worked with the data on these drivers. Two separate inputs for the two data stages makes sense.

A 180 degree phase shift might also solve the spikes problem. As it stands now the PLL setup is 0 degrees for the 125MHz, 90 degrees on the 250MHz for the wrt lines, and 180 degrees on the 250MHz for the clk lines. There is no mixing of the clocks like you did in your code. I'm not at my dev machine so I can't post the code, but it is a simple divide by two setup like:

Code: [Select]
always @(posedge clk_250_phase_90)
  begin
    dac1_wrt <= !dac1_wrt;
    dac2_wrt <= !dac2_wrt;
  end

The compiler optimizes this to a single DFF connected to both the outputs.

Guess there is more to experiment with. :)

Online BrianHG

Re: Why does this dds code fail
« Reply #36 on: March 31, 2022, 08:01:06 pm »
No, you need to instantiate a DDR PHY, 1 per pin.
I never used your FPGA, so I do not know how it works or what the IO Buffer is called.
But when you instantiate one, it will be placed on the FPGA.  In your verilog, you should also be able to define the IO voltage, slew rate, output current, and much more, even maybe PAD / pin IO number/name.

If you run the DDR at 250MHz, your output clock would be 250MHz.
You need to run it at 125MHz.

The HI and LOW inputs are internally sampled on the positive clock edge, i.e. of the 125MHz which you are clocking it at.
For the outputs, while the clk input is high, the output pin will show the HI input value sampled at the positive edge of the 125MHz source.  When the clk input is low, the output pin will show the LOW input value, also sampled at the positive edge of clk in.  So, inside the DDR buffer, there are special sample DFF gates which can operate at 2x speed to shift data captured on the positive edge of the main 125MHz clock to the next negative edge.  Remember, it is the goal of the FPGA to operate internally with 1 clock at 1 phase, all positive-edge clocked, to achieve the best FMAX performance.  It is the job of just the DDR circuitry in the IO buffer to exchange data with the core on the 1 positive-edge main clock while implementing the half phase shift only at the IO pins.
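
The behaviour described above can be modelled for simulation roughly as follows. This is only a behavioural sketch, not any vendor's primitive, and the module and port names (oddr_model, d_hi, d_lo) are made up for illustration. A real DDR output cell is a hard block with its own timing.

```verilog
// Behavioural, simulation-only model of a DDR output register.
// Names are hypothetical; this does not map to a specific vendor cell.
module oddr_model
(
  input  wire clk,    // single clock, e.g. 125MHz
  input  wire d_hi,   // value shown on the pin while clk is high
  input  wire d_lo,   // value shown on the pin while clk is low
  output wire q       // pin output, changes on both clock edges
);

  reg hi_q = 1'b0;
  reg lo_q = 1'b0;

  // Both inputs are captured on the rising edge only.
  always @(posedge clk) begin
    hi_q <= d_hi;
    lo_q <= d_lo;
  end

  // The clock level selects which captured value drives the pin.
  assign q = clk ? hi_q : lo_q;

endmodule
```

With d_hi tied to 1'b1 and d_lo tied to 1'b0, q follows clk after the first rising edge. Swapping the two constants gives the inverted clock, which is the 180 degree option.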
« Last Edit: March 31, 2022, 08:13:57 pm by BrianHG »
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #37 on: April 01, 2022, 05:28:23 am »
Guess there is more to experiment with. :)

No, you need to instantiate a DDR PHY, 1 per pin.
I never used your FPGA, so I do not know how it works or what the IO Buffer is called.

The code snippet I showed is how it is working at the moment, not a setup for DDR. That is why at the end I mentioned there is more to experiment with.

I have to research how, and even if, DDR PHYs are available on this FPGA. There is an .adc file in which IO constraints are set, so it will probably be the place to set up an IO pin as DDR. Calling it a DDR PHY is another clue that helps :-+

Online BrianHG

Re: Why does this dds code fail
« Reply #38 on: April 01, 2022, 05:36:31 am »
Careful, DDR PHY sometimes points you to a complete DDR2/3 ram interface module.  You are looking for a DDR IO Buffer.
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #39 on: April 01, 2022, 10:00:11 am »
Found how to use the DDR output in the Anlogic FPGA. Within the Tang Dynasty IDE an ODDR IP has to be generated. This gives a verilog module for a DDR output with the ability to hook up the needed signals.

For now I just did it for the DAC control signals, and the output on the scope looks to be spike free. At least on the Hantek DSO2D10. (This one starts the quickest and makes no noise)

Have to find a way to use this ODDR IP in a bus manner because having to hook up this module to every DAC data pin takes a lot of code. Maybe there is some way to use macro's for this? Not sure if it is needed though, since it is already using the DFF of the IO block for these signals. But as BrianHG pointed out the DDR logic might be faster.

In this top level code it needs the four module connections near the end to hook up the control signals. I checked with the schematic viewer and it is using the AL_PHY_PAD for the ODDR logic.

Code: [Select]
//---------------------------------------------------------------------------
//Main module for connections with the outside world

module FA201_Lichee_nano
(
  //Input signals
  input wire i_xtal,                     //50 MHz clock
  input wire i_mcu_data_strobe,          //Active low going pulse from the mcu to strobe the data
  input wire i_mcu_control_srobe,        //Active low going pulse from the mcu to strobe the control
  input wire i_mcu_read_write_select,    //Read 0 / write 1

  //Bi-directional parallel data bus to / from the mcu       
  inout wire [7:0] io_mcu_data,

  //Output signals
  output o_dac1_clk,
  output o_dac1_wrt,
  output o_dac2_clk,
  output o_dac2_wrt,

  output wire [13:0] o_dac1_d,
  output wire [13:0] o_dac2_d
);

  //---------------------------------------------------------------------------
  //Internal wires

  wire core_clock; 
   
  wire [47:0] channel1_negative_signal_step;
  wire [47:0] channel1_positive_signal_step;
  wire [47:0] channel2_negative_signal_step;
  wire [47:0] channel2_positive_signal_step;

  //---------------------------------------------------------------------------
  //Connection with the sub modules
 
  pll_clock pll 
  ( 
    .refclk   (i_xtal),
    .reset    (1'b0),
    .clk0_out (core_clock)
  );
 
  mcu_interface mcu 
  ( 
    .i_main_clk                      (core_clock),
    .i_data_strobe                   (i_mcu_data_strobe),
    .i_control_strobe                (i_mcu_control_srobe),   
    .i_read_write_select             (i_mcu_read_write_select),
    .io_data                         (io_mcu_data),   
    .o_channel1_negative_signal_step (channel1_negative_signal_step),
    .o_channel1_positive_signal_step (channel1_positive_signal_step),
    .o_channel2_negative_signal_step (channel2_negative_signal_step),
    .o_channel2_positive_signal_step (channel2_positive_signal_step)
  ); 
 
  awg dac1
  (
    .i_main_clock           (core_clock),
    .i_negative_signal_step (channel1_negative_signal_step),
    .i_positive_signal_step (channel1_positive_signal_step),
    .o_dac_d                (o_dac1_d)
  );

  awg dac2
  (
    .i_main_clock           (core_clock),
    .i_negative_signal_step (channel2_negative_signal_step),
    .i_positive_signal_step (channel2_positive_signal_step),
    .o_dac_d                (o_dac2_d)
  );
 
  output_ddr dac1_wrt
  ( 
    .clk   (core_clock), 
    .rst   (1'b0),
    .d1    (1'b0),
    .d2    (1'b1),   
    .q     (o_dac1_wrt)
  ); 
 
  output_ddr dac1_clk
  ( 
    .clk   (core_clock), 
    .rst   (1'b0),
    .d1    (1'b1),
    .d2    (1'b0),   
    .q     (o_dac1_clk)
  ); 
 
  output_ddr dac2_wrt
  ( 
    .clk   (core_clock), 
    .rst   (1'b0),
    .d1    (1'b0),
    .d2    (1'b1),   
    .q     (o_dac2_wrt)
  ); 

  output_ddr dac2_clk
  ( 
    .clk   (core_clock), 
    .rst   (1'b0),
    .d1    (1'b1),
    .d2    (1'b0),   
    .q     (o_dac2_clk)
  ); 
 
endmodule

//---------------------------------------------------------------------------

It did drop the max frequency a bit.

Code: [Select]
Timing group statistics:
Clock constraints:
  Clock Name                                  Min Period     Max Freq           Skew      Fanout            TNS
  core_clock (125.000MHz)                        5.528ns     180.897MHz        0.078ns       227        0.000ns

Online BrianHG

Re: Why does this dds code fail
« Reply #40 on: April 01, 2022, 10:32:05 am »
Have to find a way to use this ODDR IP in a bus manner because having to hook up this module to every DAC data pin takes a lot of code. Maybe there is some way to use macro's for this?

You can use compiler's 'genvar & generate' within your own single function to call multiple instances of the 'output_ddr'.
Your module should take in H and L data buses of a parameterized X bits, plus CLK of course, and output a DDR data bus of X bits.

Take a look at lines 148, 153 through 156 here:
https://github.com/BrianHGinc/BrianHG-DDR3-Controller/blob/main/BrianHG_DDR3_GFX_source_v16/BrianHG_GFX_Layer_mixer.sv

I have a home made module 'ALPHA_ADJ' I am calling multiple times.  You would be placing the 'output_ddr' inside such a loop.

Properly written, your home made 'multibit_output_ddr' function should be something like 6 lines of code.
You would run 1 for the output data bus set to 12 or 24 bits, and another 1 set to 4 bits for all your clocks.
Use the concatenation in the IO port to stack the IO pins.
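
A minimal sketch of such a wrapper, assuming the IP-generated module is called output_ddr and has the ports clk, rst, d1, d2 and q used in the top level posted earlier. The parameter name WIDTH and the generate block label are illustrative. It cannot be compiled without the vendor-generated output_ddr module.

```verilog
// Sketch: one output_ddr instance per bit, built with a generate loop.
// output_ddr is the vendor IP module; its port list is taken from the
// posted top level and is otherwise an assumption.
module multibit_output_ddr
#(
  parameter WIDTH = 14
)
(
  input  wire             clk,
  input  wire [WIDTH-1:0] d1,
  input  wire [WIDTH-1:0] d2,
  output wire [WIDTH-1:0] q
);

  genvar i;

  generate
    for (i = 0; i < WIDTH; i = i + 1) begin : g_oddr
      output_ddr oddr_i
      (
        .clk (clk),
        .rst (1'b0),
        .d1  (d1[i]),
        .d2  (d2[i]),
        .q   (q[i])
      );
    end
  endgenerate

endmodule
```

For the four control signals this could be instantiated with WIDTH set to 4, q connected to {o_dac2_clk, o_dac2_wrt, o_dac1_clk, o_dac1_wrt}, d1 set to 4'b1010 and d2 set to 4'b0101, which matches the per-pin constants in the posted top level.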
 

Online Someone

  • Super Contributor
  • ***
  • Posts: 4956
  • Country: au
    • send complaints here
Re: Why does this dds code fail
« Reply #41 on: April 01, 2022, 11:43:42 am »
You can use compiler's 'genvar & generate' within your own single function to call multiple instances of the 'output_ddr'.
Yep, that's the HDL-agnostic concept. Verilog has a shortcut where you can broadcast single control signals to arrays of instances, courtesy of its lax typing rules, in a very readable one-liner:
https://stackoverflow.com/questions/21615210/instantiating-multiple-modules-in-verilog
Works great for exactly this common issue of sending a bus through some vendor primitives.
 

Offline pcprogrammerTopic starter

Re: Why does this dds code fail
« Reply #42 on: April 02, 2022, 09:13:07 am »
Used the vector option pointed out by Someone and it works. It is spike free with the right order of the DAC wrt and clk signals. When I swap them the spikes come back. Less frequent than with the earlier designs, but there.

So the foundation has been laid. Next up is more work on the MCU interface part to allow updating of the two step registers in a single action and also allow for setting the signal phase registers.

Attached is the project so far.

Below is the module with the ODDR vector approach:

Code: [Select]
//----------------------------------------------------------------------------------
//Module for generating the DAC signals

module awg
(
  //Input
  input i_main_clock,
   
  input [47:0] i_negative_signal_step,
  input [47:0] i_positive_signal_step,   

  //Output
  output wire o_dac_clk,
  output wire o_dac_wrt, 

  output wire [13:0] o_dac_d
);

  //--------------------------------------------------------------------------------
  //Registers

  reg [47:0] signal_phase = 0;

  //--------------------------------------------------------------------------------
  //Logic
 
  always@(posedge i_main_clock)
    begin 
      if(signal_phase[47] == 1'b0)       
        signal_phase <= signal_phase + i_negative_signal_step;     
      else
        signal_phase <= signal_phase + i_positive_signal_step;     
    end
   
  output_ddr dac_data[13:0]   
  ( 
    .clk   (i_main_clock), 
    .rst   (1'b0),
    .d1    (signal_phase[47:34]),
    .d2    (signal_phase[47:34]),   
    .q     (o_dac_d)
  );   

  output_ddr dac_wrt
  ( 
    .clk   (i_main_clock), 
    .rst   (1'b0),
    .d1    (1'b0),
    .d2    (1'b1),   
    .q     (o_dac_wrt)
  ); 
 
  output_ddr dac_clk
  ( 
    .clk   (i_main_clock), 
    .rst   (1'b0),
    .d1    (1'b1),
    .d2    (1'b0),   
    .q     (o_dac_clk)
  ); 

endmodule

//----------------------------------------------------------------------------------

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8089
  • Country: ca
Re: Why does this dds code fail
« Reply #43 on: April 02, 2022, 12:00:26 pm »
? Your project disappeared...

Anyways, it should be like this (as I saw it in the DAC data sheet):

Code: [Select]
//----------------------------------------------------------------------------------
//Module for generating the DAC signals

module awg
(
  //Input
  input i_main_clock,  // This is the main PLL 125MHz output clock set to 0 degrees.
  input i_dac_clock,   // This is a second 125MHz clock from your PLL, output #2, but set to 90 degrees offset.
   
  input [47:0] i_negative_signal_step,
  input [47:0] i_positive_signal_step,   

  //Output
  output wire o_dac_clk,
  output wire o_dac_wrt, 

  output wire [13:0] o_dac_d
);

parameter bit INVERT_CLK = 0 ; // Set to 1 to invert the clock output.

  //--------------------------------------------------------------------------------
  //Registers

  reg [47:0] signal_phase = 0;

  //--------------------------------------------------------------------------------
  //Logic
 
  always@(posedge i_main_clock)
    begin 
      if(signal_phase[47] == 1'b0)       
        signal_phase <= signal_phase + i_negative_signal_step;     
      else
        signal_phase <= signal_phase + i_positive_signal_step;     
    end
   
  output_ddr dac_data[13:0]   
  ( 
    .clk   (i_main_clock), 
    .rst   (1'b0),
    .d1    (signal_phase[47:34]),
    .d2    (signal_phase[47:34]),   
    .q     (o_dac_d)
  );   

  output_ddr dac_wrt
  ( 
    .clk   (i_dac_clock), 
    .rst   (1'b0),
    .d1    (INVERT_CLK),
    .d2    (!INVERT_CLK),   
    .q     (o_dac_wrt)
  ); 
 
  output_ddr dac_clk
  ( 
    .clk   (i_dac_clock), 
    .rst   (1'b0),
    .d1    (INVERT_CLK),
    .d2    (!INVERT_CLK),   
    .q     (o_dac_clk)
  ); 

endmodule

//----------------------------------------------------------------------------------


With this setup, probing the dac_clk outputs and data outputs on 2 different scope channels should reveal a perfect 90 degree phase delay relationship as shown to be optimal in the dac's data sheet.

You may need to toggle the 'INVERT_CLK', or swap one of the '!' invert clks for the dac clk.
« Last Edit: April 02, 2022, 12:03:31 pm by BrianHG »
 

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #44 on: April 02, 2022, 12:39:51 pm »
? Your project disappeared...

How strange, but well, here is a new one. I added the possibility to reset the signal phase registers and a way to add a set value to either one of them to change the phase relation between the two channels. Also made it so that the step values are loaded simultaneously on a separate control.

I will try the 90 degree phase option next.

Edit: The 90 degree phase option also works. Without measuring the actual write and clock signals I can't tell the difference. The saw tooth outputs look good and stable.
« Last Edit: April 02, 2022, 01:06:42 pm by pcprogrammer »
 

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #45 on: April 02, 2022, 02:18:38 pm »
Another question about verilog I can't find an answer for:

To allow the selection of the different wave-forms the data going to the DDR output needs a different assignment but using a case statement or an if else structure seems to be only possible within always blocks and then can't assign data to a wire. This means I would have to make another register to clock the selected data into and use that register as input for the DDR output.

I have tested different assignments for making the different signals and they work, but I need a way to select one of them based on data in a register.

Code: [Select]
  assign dac_signal_data = signal_phase[47] ? 14'h3FFF : 14'h0;

//This is ok for ramp down
//  assign dac_signal_data = ~signal_phase[47:34];
 
  //This is ok for ramp up
//  assign dac_signal_data = signal_phase[47:34];

Is there a proper way of doing this or do I need the register approach?
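
One possible answer, as a sketch rather than the only way: a continuous assignment can select between the three expressions above with nested conditional operators, so the target stays a wire and no clock is involved. The module name `waveform_select` and the 3-bit `waveform_control` encoding are assumptions made for the illustration.

```verilog
module waveform_select
(
  input  wire [2:0]  waveform_control,  // assumed encoding: 1 = ramp down, 2 = square
  input  wire [47:0] signal_phase,
  output wire [13:0] dac_signal_data
);
  // Nested conditional operators describe a combinational multiplexer
  // that drives a wire directly; anything else selects the ramp up.
  assign dac_signal_data =
      (waveform_control == 3'h1) ? ~signal_phase[47:34] :
      (waveform_control == 3'h2) ? (signal_phase[47] ? 14'h3FFF : 14'h0) :
                                   signal_phase[47:34];
endmodule
```

This is purely combinational, so it adds logic delay in front of whatever register or DDR cell follows it.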

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8089
  • Country: ca
Re: Why does this dds code fail
« Reply #46 on: April 02, 2022, 07:20:00 pm »
It's pen and paper time...
Please visually illustrate to me what you want to achieve with your waveform.
I think you are going about things in a completely backwards manner.
I need to see the function output you are trying to generate.
Also, do you have a modelsim testbench of your code?
 

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #47 on: April 02, 2022, 07:35:13 pm »
This is something for tomorrow 8)

Unfortunately I have not yet managed to get the simulator to work with the Anlogic code. I did do some separate simulation on the MCU interface and some clock stuff, but with the PLL and now the ODDR code in there it gets a bit complicated.

I saw your post in the other thread about using registered instead of combinational control lines. I see your point that using registers gives a cleaner setup than using a combinational stack of gates. I assume the compiler will reduce the number of registers down to the ones that are actually used.

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8089
  • Country: ca
Re: Why does this dds code fail
« Reply #48 on: April 02, 2022, 07:41:59 pm »

Unfortunately I have not yet managed to get the simulator to work with the Anlogic code. I did do some separate simulation on the MCU interface and some clock stuff, but with the PLL and now the ODDR code in there it gets a bit complicated.

Skip the ODDR and PLL.
Those 2 should only be at your top hierarchy driving the IO.
But yes, you need the -L library option on the vsim command line.
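
A minimal sketch of what simulating without the PLL and ODDR could look like, assuming the phase accumulator is moved into its own vendor-free module. The names `awg_core`, `awg_core_tb` and `o_sample` are hypothetical; the accumulator logic is copied from the awg module earlier in the thread, and the testbench clock replaces the PLL output.

```verilog
`timescale 1ns / 1ps

// Vendor-free core: only the phase accumulator from the awg module.
module awg_core
(
  input  wire        i_main_clock,
  input  wire [47:0] i_negative_signal_step,
  input  wire [47:0] i_positive_signal_step,
  output wire [13:0] o_sample
);
  reg [47:0] signal_phase = 0;

  always @(posedge i_main_clock)
    if (signal_phase[47] == 1'b0)
      signal_phase <= signal_phase + i_negative_signal_step;
    else
      signal_phase <= signal_phase + i_positive_signal_step;

  // The top 14 bits are what would go to the DDR output cells
  assign o_sample = signal_phase[47:34];
endmodule

// Testbench: a behavioral clock stands in for the PLL.
module awg_core_tb;
  reg clk = 1'b0;
  wire [13:0] sample;

  always #4 clk = ~clk;   // 8 ns period, i.e. 125 MHz

  awg_core dut
  (
    .i_main_clock           (clk),
    .i_negative_signal_step (48'h0100_0000_0000),  // 2^40: 256 clocks per period
    .i_positive_signal_step (48'h0100_0000_0000),
    .o_sample               (sample)
  );

  initial begin
    $monitor("t=%0t sample=%h", $time, sample);
    #(8 * 300);             // run a bit more than one full period
    $finish;
  end
endmodule
```

With this split the top-level module would only instantiate the PLL, the core and the DDR cells, so nothing in the simulated part depends on the vendor library.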
 

Online Someone

  • Super Contributor
  • ***
  • Posts: 4956
  • Country: au
    • send complaints here
Re: Why does this dds code fail
« Reply #49 on: April 02, 2022, 10:47:04 pm »
using a case statement or an if else structure seems to be only possible within always blocks and then can't assign data to a wire. This means I would have to make another register to clock the selected data into and use that register
Sounds like you have picked up Verilog on an as-needed basis rather than learning it from scratch (have met several people who have done this). Recommendation, STOP! HDL coding is nothing like other software even if it looks the same. If you jump in without understanding the basics, and learning the strict terminology, it will be very confusing.

Sadly the freely available references/guides/courses for Verilog are not as numerous or as good as those for VHDL, but there are some out there that cover these sorts of fundamentals:
https://verilogguide.readthedocs.io/en/latest/verilog/procedure.html

Just because Verilog calls it a register doesn't mean that it becomes a register in implementation, confusing!
 

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #50 on: April 03, 2022, 05:21:14 am »
What can I say, nowadays I'm finding it hard to plow through documentation without a direct practical use case, and then even keeping focus is tough. With chronic fatigue syndrome things slow down. But still like to hobby.

I chose verilog over vhdl based on it being closer to C syntax. You are right that there is not a lot about verilog on the net, but I did find a couple:

http://www.asic-world.com/verilog/index.html
https://www.chipverify.com/verilog/verilog-tutorial

Sadly most of the given examples do not provide the information I'm looking for, and learning from professionals helps a lot.
It does require another way of thinking doing hardware instead of software, which is tricky after so many years of just doing software.

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #51 on: April 03, 2022, 06:01:23 am »
It's pen and paper time...
Please visually illustrate to me what you want to achieve with your waveform.
I think you are going about things in a completely backwards manner.
I need to see the function output you are trying to generate.
Also, do you have a modelsim testbench of your code?

The generator I have working now only gives a saw tooth at the output of the DAC. I'd like to be able to select different wave forms via a control register. I tested the other wave forms by hard coding the assignments and they work.

Eventually it should also have an arbitrary lookup table based output, which is not that hard to do. Set up block RAM and address it with part of the signal phase counter.

The problem lies in the multiplexing between the different data. So I'm looking for the proper way to do that. I have drawn a simple schematic like you asked.

I tried it with a case statement but got errors about not being able to write to a wire, which in a way makes sense.

Code: [Select]
reg [2:0] waveform_control = 0;
wire [13:0] dac_signal_data;

always@(waveform_control)
case(waveform_control)
default: dac_signal_data = signal_phase[47:34];
3'h1:  dac_signal_data = ~signal_phase[47:34];
3'h2: dac_signal_data = signal_phase[47] ? 14'h3FFF : 14'h0;
endcase

Just tried it with a reg and it does work, but the FMAX takes a hit. It drops from ~180MHz to ~159MHz.

Code: [Select]
reg [2:0] waveform_control = 0;
reg [13:0] dac_signal_data;

always@(waveform_control or signal_phase[47:34])
case(waveform_control)
default: dac_signal_data = signal_phase[47:34];
3'h1:  dac_signal_data = ~signal_phase[47:34];
3'h2: dac_signal_data = signal_phase[47] ? 14'h3FFF : 14'h0;
endcase

So is it better practice to also use the main clock (125MHz) for this always block and have the data clocked into an actual DFF?

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8089
  • Country: ca
Re: Why does this dds code fail
« Reply #52 on: April 03, 2022, 04:23:15 pm »

Your waveform control should be an input from your MCU control regs.
dac_signal_data should be your output reg going to the IO pins, or DDR buffers.

Code: [Select]
always@(posedge system_CLK_125MHz) begin
case(waveform_control)
default: dac_signal_data <= signal_phase[47:34];
3'h1:  dac_signal_data <= ~signal_phase[47:34];
3'h2: dac_signal_data <= signal_phase[47] ? 14'h3FFF : 14'h0;
endcase
end

You're missing the saw and sine.
To get the best quality sine for the least amount of memory, it is a little tricky, as you would store only 1/4 of a waveform and compute the other 3 quadrants in real time.  See how much memory is realistically available to you and use a power-of-2 size with a 16 bit output, as your FPGA probably reserves memory in blocks which are 1/2/4/8/16 bits wide.
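
A rough sketch of the quarter-wave idea described above, under several assumptions that are not from this thread: a 1024-entry table of 13-bit magnitudes, a 12-bit phase input (for example the top 12 bits of the phase accumulator), offset binary output for a 14-bit DAC, and a table filled with `$sin` in an initial block. The `$sin` fill is meant for simulation; for synthesis the table would more likely be loaded with `$readmemh` or a vendor ROM initialization file. The module name `quarter_sine` is made up.

```verilog
module quarter_sine
(
  input  wire        clk,
  input  wire [11:0] phase,   // [11] = half period, [10] = mirror, [9:0] = table index
  output reg  [13:0] sine     // offset binary, mid-scale = 14'h2000
);
  // One quarter of a sine period, magnitudes 0..8191
  reg [12:0] quarter_rom [0:1023];

  integer k;
  initial
    for (k = 0; k < 1024; k = k + 1)
      // The half-sample offset makes the mirrored read (~index) exactly symmetric
      quarter_rom[k] = $rtoi(8191.0 * $sin((k + 0.5) * 3.14159265358979 / 2048.0) + 0.5);

  reg [9:0]  addr   = 0;
  reg        neg_s1 = 0;
  reg [12:0] mag    = 0;
  reg        neg_s2 = 0;

  always @(posedge clk) begin
    // Stage 1: in the 2nd and 4th quadrant the table is read backwards
    addr   <= phase[10] ? ~phase[9:0] : phase[9:0];
    neg_s1 <= phase[11];      // second half of the period is negative

    // Stage 2: registered table read, sign flag delayed to match
    mag    <= quarter_rom[addr];
    neg_s2 <= neg_s1;

    // Stage 3: apply the sign around mid-scale
    sine   <= neg_s2 ? (14'h2000 - mag) : (14'h2000 + mag);
  end
endmodule
```

The three register stages give a latency of three clocks, which only shows up as a fixed phase offset on the output.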
« Last Edit: April 03, 2022, 04:27:37 pm by BrianHG »
 

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #53 on: April 03, 2022, 04:51:10 pm »
You're missing the saw and sine.
To get the best quality sine for the least amount of memory, it is a little tricky, as you would store only 1/4 of a waveform and compute the other 3 quadrants in real time.  See how much memory is realistically available to you and use a power-of-2 size with a 16 bit output, as your FPGA probably reserves memory in blocks which are 1/2/4/8/16 bits wide.

The code was just a sample to make the point. I still have to get used to doing everything on the "master" clock. Already tried the always@(posedge clk_125MHz) and it did raise the FMAX back up to ~180MHz, so that also answered the question I guess.

I think the memory setup is similar to the Cyclone IV. Already have some samples of that in the FNIRSI project of morris6, and found several websites about the quarter sine within FPGA. Also have experience with it in C, so don't see a problem there.

The, what you call, saw is triangle in my mind, and for my "pulse width" setup it also needs to be done in 4 quadrants. Have to make some setup with the upper two bits of the signal phase to make that work.
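
For reference, the simplest symmetric triangle can be folded out of the phase counter with just the top bit, as sketched below. This is not the 4-quadrant "pulse width" setup described above, only the basic fold it would build on; the module name `triangle_fold` is made up.

```verilog
module triangle_fold
(
  input  wire [47:0] signal_phase,
  output wire [13:0] triangle
);
  // First half of the period: bits [46:33] ramp from 0 to full scale.
  // Second half: the same bits inverted, so the ramp runs back down.
  assign triangle = signal_phase[47] ? ~signal_phase[46:33]
                                     :  signal_phase[46:33];
endmodule
```

Because the slope is taken one bit lower than the saw tooth, the triangle has the same period and the same full-scale swing.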

The project is taking proper shape now. Just a couple of steps and then back to writing C code to allow controlling of the two signals.

Again thanks for your help Brian.
Cheers,
Peter

Online Someone

  • Super Contributor
  • ***
  • Posts: 4956
  • Country: au
    • send complaints here
Re: Why does this dds code fail
« Reply #54 on: April 04, 2022, 04:03:37 am »
The syntax might have some C style layout/elements but how that implements is entirely different. I was pointing out (with reference to documentation) that your idea of always block = register is incorrect. An always block is an abstraction that can mean many different things; having code inside an always block does not mean there are registers (it can be combinatorial/asynchronous). Having a Verilog object declared as a register does not mean it will be implemented as a register; it could be nothing more than a passive signal "wire" (but not interchangeable with a Verilog wire), or it could be a latch, or something else like an SRAM cell. A Verilog register could be optimized away into the middle of a blob of logic so that it can't even be observed or uniquely identified in the implementation. It is an abstraction.

There isn't one way to describe a multiplexer. I can think of 4 common multiplexer descriptions in Verilog/VHDL, all able to implement a combinatorial or clocked/sequential/registered multiplexer. Which one to use in any specific situation is dependent on the structure and style of the code/system around it. You'll keep running into mental blocks and these sorts of problems if you just copypasta blocks of code which, although they do the larger function you are thinking of, you don't understand the workings of. How to use an always block is where you need to start (as evidenced by your follow-on code examples).
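
To make the "reg is not a register" point concrete, here is a small sketch with three outputs that are all declared `reg` but would normally be implemented three different ways. The module and signal names are made up for the illustration, and what a given synthesis tool actually builds may differ in detail.

```verilog
module reg_is_not_register
(
  input  wire       clk,
  input  wire       sel,
  input  wire [7:0] a,
  input  wire [7:0] b,
  output reg  [7:0] y_comb,   // declared reg, becomes plain multiplexer logic
  output reg  [7:0] y_ff,     // declared reg, becomes flip-flops
  output reg  [7:0] y_latch   // declared reg, becomes latches
);
  // Combinational: sensitive to every input (@*) and assigned on every
  // path, so no storage is needed.
  always @(*)
    if (sel) y_comb = a;
    else     y_comb = b;

  // Sequential: only updates on the clock edge, so flip-flops are inferred.
  always @(posedge clk)
    y_ff <= sel ? a : b;

  // Incomplete assignment: with sel low the old value has to be held,
  // so a level-sensitive latch is inferred (usually by accident).
  always @(*)
    if (sel) y_latch = a;
endmodule
```

The first two blocks describe the same multiplexer; the only difference is whether its output is registered.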
 

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #55 on: April 04, 2022, 03:44:43 pm »
Apart from the sine wave the other signals are implemented.

Removed the MCU read stuff since I don't need it for this project.

Attached are some screen captures of the Hantek DSO with signals generated with this hardware.

Edit: also attached a modelsim project for simulating the triangle setup.
« Last Edit: April 04, 2022, 03:48:06 pm by pcprogrammer »
 

Offline pcprogrammerTopic starter

  • Super Contributor
  • ***
  • Posts: 4321
  • Country: nl
Re: Why does this dds code fail
« Reply #56 on: April 07, 2022, 09:24:39 am »
Implemented the sine wave with a dual port rom to serve the two channels.

The first direct sine creation attempt broke the FMAX, but by implementing it with a pipeline structure found on the internet, FMAX is back up to ~180MHz and the signal looks good.

Had fun and learned a lot from doing this project.

 :-+ for BrianHG.

