Author Topic: OSERDES output data delay  (Read 775 times)

Offline alexeyey_0 (Topic starter)

  • Newbie
  • Posts: 5
  • Country: ru
OSERDES output data delay
« on: April 02, 2025, 11:31:54 am »
Hello everyone, I'm currently writing an image processing project in SystemVerilog (somewhat similar to MIPI DSI). While working with the Xilinx IP block (OSERDESE2), I ran into a problem: the output data does not appear at the moment given in the specification. I decided to compare a SystemVerilog implementation with a VHDL one. The result: there is a 0.1 ps delay in SV, while in VHDL the data changes exactly on the clock edges. I want the same result as in VHDL, that is, no delay in driving data onto the output transmission line. What could be the problem, and how can I solve it? I am attaching 2 testbenches, one written purely in SV and the other in VHDL. I am also attaching 2 timing diagrams: 1 - SV, 2 - VHDL
1. sv tb

Code: [Select]
`timescale 1ns / 1ps

module tb_dvi;

  logic clk = 0;
  logic clk_x5 = 0;
  logic [9:0] data = 10'b0;
  logic reset = 1;
  logic serial;

  logic shift1;
  logic shift2;
  logic ce_delay;
  logic [7:0] reset_delay;

  localparam time clock_period = 20;

  // Master OSERDESE2
  OSERDESE2 #(
    .DATA_RATE_OQ("DDR"),
    .DATA_RATE_TQ("DDR"),
    .DATA_WIDTH(10),
    .INIT_OQ(1'b1),
    .INIT_TQ(1'b1),
    .SERDES_MODE("MASTER"),
    .SRVAL_OQ(1'b0),
    .SRVAL_TQ(1'b0),
    .TBYTE_CTL("FALSE"),
    .TBYTE_SRC("FALSE"),
    .TRISTATE_WIDTH(1)
  ) master_serdes (
    .OFB(),
    .OQ(serial),
    .SHIFTOUT1(),
    .SHIFTOUT2(),
    .TBYTEOUT(),
    .TFB(),
    .TQ(),
    .CLK(clk_x5),
    .CLKDIV(clk),
    .D1(data[0]),
    .D2(data[1]),
    .D3(data[2]),
    .D4(data[3]),
    .D5(data[4]),
    .D6(data[5]),
    .D7(data[6]),
    .D8(data[7]),
    .OCE(ce_delay),
    .RST(reset),
    .SHIFTIN1(shift1),
    .SHIFTIN2(shift2),
    .T1(1'b0),
    .T2(1'b0),
    .T3(1'b0),
    .T4(1'b0),
    .TBYTEIN(1'b0),
    .TCE(1'b0)
  );

  // Slave OSERDESE2
  OSERDESE2 #(
    .DATA_RATE_OQ("DDR"),
    .DATA_RATE_TQ("DDR"),
    .DATA_WIDTH(10),
    .INIT_OQ(1'b1),
    .INIT_TQ(1'b1),
    .SERDES_MODE("SLAVE"),
    .SRVAL_OQ(1'b0),
    .SRVAL_TQ(1'b0),
    .TBYTE_CTL("FALSE"),
    .TBYTE_SRC("FALSE"),
    .TRISTATE_WIDTH(1)
  ) slave_serdes (
    .OFB(),
    .OQ(),
    .SHIFTOUT1(shift1),
    .SHIFTOUT2(shift2),
    .TBYTEOUT(),
    .TFB(),
    .TQ(),
    .CLK(clk_x5),
    .CLKDIV(clk),
    .D1(1'b0),
    .D2(1'b0),
    .D3(data[8]),
    .D4(data[9]),
    .D5(1'b0),
    .D6(1'b0),
    .D7(1'b0),
    .D8(1'b0),
    .OCE(ce_delay),
    .RST(reset),
    .SHIFTIN1(1'b0),
    .SHIFTIN2(1'b0),
    .T1(1'b0),
    .T2(1'b0),
    .T3(1'b0),
    .T4(1'b0),
    .TBYTEIN(1'b0),
    .TCE(1'b0)
  );

  always_ff @(posedge clk) begin
    ce_delay <= ~reset;
  end

  initial forever #(clock_period / 4) clk = ~clk;

  initial forever #(clock_period / 20) clk_x5 = ~clk_x5;
 
  initial begin
    #200 reset = 0;
    #200 data  = 10'b1100010101; // 10'h315
    #20  data  = 10'b1100110101; // 10'h335
    #20  data  = 10'b1100100100; // 10'h324
    #20  $finish;
  end
 
endmodule

2. vhdl tb

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
library UNISIM;
use UNISIM.VComponents.all;

ENTITY tb_ser IS
END tb_ser;

ARCHITECTURE behavior OF tb_ser IS
   
    signal clk     : std_logic := '0';
    signal clk_x5  : std_logic := '0';
    signal data    : std_logic_vector(9 downto 0) := (others => '0');
    signal reset   : std_logic := '1'; -- Initial state '1'
    signal serial  : std_logic := '0';

    signal shift1      : std_logic := '0';
    signal shift2      : std_logic := '0';
    signal ce_delay    : std_logic := '0';
    signal reset_delay : std_logic_vector(7 downto 0) := (others => '0');

    constant clock_period : time := 20 ns;
   
BEGIN

master_serdes : OSERDESE2
   generic map (
      DATA_RATE_OQ => "DDR",   -- DDR, SDR
      DATA_RATE_TQ => "DDR",   -- DDR, BUF, SDR
      DATA_WIDTH => 10,         -- Parallel data width (2-8,10,14)
      INIT_OQ => '1',          -- Initial value of OQ output (1'b0,1'b1)
      INIT_TQ => '1',          -- Initial value of TQ output (1'b0,1'b1)
      SERDES_MODE => "MASTER", -- MASTER, SLAVE
      SRVAL_OQ => '0',         -- OQ output value when SR is used (1'b0,1'b1)
      SRVAL_TQ => '0',         -- TQ output value when SR is used (1'b0,1'b1)
      TBYTE_CTL => "FALSE",    -- Enable tristate byte operation (FALSE, TRUE)
      TBYTE_SRC => "FALSE",    -- Tristate byte source (FALSE, TRUE)
      TRISTATE_WIDTH => 1      -- 3-state converter width (1,4)
   )
   port map (
      OFB       => open,
      OQ        => serial,
      SHIFTOUT1 => open,
      SHIFTOUT2 => open,
      TBYTEOUT  => open,
      TFB       => open,
      TQ        => open,
      CLK       => clk_x5,
      CLKDIV    => clk,
      D1 => data(0),
      D2 => data(1),
      D3 => data(2),
      D4 => data(3),
      D5 => data(4),
      D6 => data(5),
      D7 => data(6),
      D8 => data(7),
      OCE => ce_delay,
      RST => reset,
      SHIFTIN1 => SHIFT1,
      SHIFTIN2 => SHIFT2,
      T1 => '0',
      T2 => '0',
      T3 => '0',
      T4 => '0',
      TBYTEIN => '0',
      TCE => '0'
   );

slave_serdes : OSERDESE2
   generic map (
      DATA_RATE_OQ   => "DDR",
      DATA_RATE_TQ   => "DDR",
      DATA_WIDTH     => 10,
      INIT_OQ        => '1',
      INIT_TQ        => '1',
      SERDES_MODE    => "SLAVE",
      SRVAL_OQ       => '0',
      SRVAL_TQ       => '0',
      TBYTE_CTL      => "FALSE",
      TBYTE_SRC      => "FALSE",
      TRISTATE_WIDTH => 1
   )
   port map (
      OFB       => open,
      OQ        => open,
      SHIFTOUT1 => shift1,
      SHIFTOUT2 => shift2,
      TBYTEOUT  => open,
      TFB       => open,
      TQ        => open,
      CLK       => clk_x5,
      CLKDIV    => clk,
      D1       => '0',
      D2       => '0',
      D3       => data(8),
      D4       => data(9),
      D5       => '0',
      D6       => '0',
      D7       => '0',
      D8       => '0',
      OCE      => ce_delay,
      RST      => reset,
      SHIFTIN1 => '0',
      SHIFTIN2 => '0',
      T1       => '0',
      T2       => '0',
      T3       => '0',
      T4       => '0',
      TBYTEIN  => '0',
      TCE      => '0'
   );

delay_ce: process(clk)
    begin
        if rising_edge(clk) then
            ce_delay <= not reset;
        end if;
    end process;

    -- Main clock generation (clk; the 5 ns half-period below gives 100 MHz, not 50 MHz)
    clock_process : process
    begin
        while true loop
            clk <= '0';
            wait for clock_period / 4;
            clk <= '1';
            wait for clock_period / 4;
        end loop;
    end process;

    -- Fast clock generation (clk_x5; the 1 ns half-period below gives 500 MHz, not 250 MHz)
    clock_x5_process : process
    begin
        while true loop
            clk_x5 <= '0';
            wait for clock_period / 20;
            clk_x5 <= '1';
            wait for clock_period / 20;
        end loop;
    end process;

    -- Test stimulus
    stim_proc: process
    begin
        wait for 200 ns;
        reset <= '0';  -- Release reset
        wait for 200 ns;
       
        -- Apply input data
        data <= "1100010101";  -- 10'h315 in binary
        wait for 20 ns;
        data <= "1100110101";  -- 10'h335
        wait for 20 ns;
        data <= "1100100100";  -- 10'h324
        wait for 20 ns;

        wait;
    end process;

END behavior;

 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2909
  • Country: ca
Re: OSERDES output data delay
« Reply #1 on: April 02, 2025, 01:25:06 pm »
Your clocks are aligned differently in those two tests. And I hope you do know that this is NOT the proper way to clock these blocks - you HAVE to use an MMCM/PLL to generate phase-aligned clocks.
Another thing - as per the spec, you need to reset these blocks at the beginning before you can use them, so that several blocks can be synchronized; otherwise their internal state might not be aligned.
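
For simulation only, a minimal sketch of what "phase-aligned" could look like inside a testbench: both clocks are toggled from a single process, so every clk edge coincides with a clk_x5 edge by construction. The module name, the 250 MHz figure and the `phase` counter are assumptions made for illustration, not something from the posts above, and in real hardware the clocks still have to come from an MMCM/PLL.

```systemverilog
`timescale 1ns / 1ps

// Simulation-only sketch (hypothetical module name): one process drives
// both clocks, so their edges cannot drift apart in the testbench.
module tb_clkgen_sketch;
  localparam realtime FAST_HALF = 2.0;  // ns; assumed 250 MHz serial clock

  logic clk_x5 = 0;  // serial clock (OSERDESE2 CLK)
  logic clk    = 0;  // divided clock (OSERDESE2 CLKDIV) = clk_x5 / 5
  int   phase  = 0;  // counts clk_x5 half-periods, 0..9

  initial forever begin
    #(FAST_HALF);
    phase  = (phase + 1) % 10;
    clk_x5 = ~clk_x5;
    // clk rises together with a clk_x5 rising edge (phase 1) and falls
    // together with a clk_x5 falling edge (phase 6): 5 fast periods per slow period.
    if (phase == 1 || phase == 6) clk = ~clk;
  end

  initial begin
    $monitor("%0t clk=%b clk_x5=%b", $time, clk, clk_x5);
    #100 $finish;
  end
endmodule
```

Two independent `forever` loops usually line up as well, but a single process makes the relationship explicit and easy to inspect in the waveform.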
 
The following users thanked this post: pardo-bsso, glenenglish

Offline alexeyey_0 (Topic starter)

  • Newbie
  • Posts: 5
  • Country: ru
Re: OSERDES output data delay
« Reply #2 on: April 02, 2025, 01:36:41 pm »
Quote from: asmi
Your clocks are aligned differently in those two tests. And I hope you do know that this is NOT the proper way to clock these blocks - you HAVE to use an MMCM/PLL to generate phase-aligned clocks.
Another thing - as per spec, you need to reset these blocks in the beginning before you can use them so that you can synchronize several blocks as otherwise their internal state might not be aligned.

How can my clocks be aligned differently if I look at the timeline and they are the same?
I'll try to connect the MMCM and see what happens, thanks.
I do reset these blocks at the beginning: reset is initialized to 1 and then lowered to 0.
 

Offline alexeyey_0 (Topic starter)

  • Newbie
  • Posts: 5
  • Country: ru
Re: OSERDES output data delay
« Reply #3 on: April 02, 2025, 01:51:35 pm »
Nothing changed

1. sv tb (added MMCM)

Code: [Select]
`timescale 1ns / 1ps

module tb_dvi;

  logic       clk     = 0;
  logic [9:0] data    = 10'b0;
  logic       reset   = 1;
  logic       serial;

  logic       shift1;
  logic       shift2;
  logic       ce_delay;
  logic [7:0] reset_delay;

  logic       clk_50;
  logic       clk_250;
  logic       clkfb;
  logic       locked_o;  // MMCM LOCKED flag (previously an implicit net)

  localparam time clock_period = 10;

  MMCME2_BASE #(
    .BANDWIDTH          ( "OPTIMIZED" ),
    .DIVCLK_DIVIDE      ( 1           ),
    .CLKFBOUT_MULT_F    ( 10.0        ),
    .CLKFBOUT_PHASE     ( 0.0         ),
    .CLKIN1_PERIOD      ( 10.0        ),
    .CLKOUT0_DIVIDE_F   ( 1.0         ),
    .CLKOUT1_DIVIDE     ( 20          ),
    .CLKOUT2_DIVIDE     ( 4           ),
    .CLKOUT3_DIVIDE     ( 1           ),
    .CLKOUT4_DIVIDE     ( 1           ),
    .CLKOUT5_DIVIDE     ( 1           ),
    .CLKOUT6_DIVIDE     ( 1           ),
    .CLKOUT0_DUTY_CYCLE ( 0.5         ),
    .CLKOUT1_DUTY_CYCLE ( 0.5         ),
    .CLKOUT2_DUTY_CYCLE ( 0.5         ),
    .CLKOUT3_DUTY_CYCLE ( 0.5         ),
    .CLKOUT4_DUTY_CYCLE ( 0.5         ),
    .CLKOUT5_DUTY_CYCLE ( 0.5         ),
    .CLKOUT6_DUTY_CYCLE ( 0.5         ),
    .CLKOUT0_PHASE      ( 0.0         ),
    .CLKOUT1_PHASE      ( 0.0         ),
    .CLKOUT2_PHASE      ( 0.0         ),
    .CLKOUT3_PHASE      ( 0.0         ),
    .CLKOUT4_PHASE      ( 0.0         ),
    .CLKOUT5_PHASE      ( 0.0         ),
    .CLKOUT6_PHASE      ( 0.0         ),
    .CLKOUT4_CASCADE    ( "FALSE"     ),
    .REF_JITTER1        ( 0.0         ),
    .STARTUP_WAIT       ( "FALSE"     )
  ) MMCME2_BASE_inst (
    .CLKOUT0   (              ),
    .CLKOUT0B  (              ),
    .CLKOUT1   ( clk_50       ),
    .CLKOUT1B  (              ),
    .CLKOUT2   ( clk_250      ),
    .CLKOUT2B  (              ),
    .CLKOUT3   (              ),
    .CLKOUT3B  (              ),
    .CLKOUT4   (              ),
    .CLKOUT5   (              ),
    .CLKOUT6   (              ),
    .CLKFBOUT  ( clkfb        ),
    .CLKFBOUTB (              ),
    .LOCKED    ( locked_o     ),
    .CLKIN1    ( clk          ),
    .PWRDWN    ( 1'b0         ),
    .RST       ( 1'b0         ),
    .CLKFBIN   ( clkfb        )
  );

  // Master OSERDESE2
  OSERDESE2 #(
    .DATA_RATE_OQ   ( "DDR"    ),
    .DATA_RATE_TQ   ( "DDR"    ),
    .DATA_WIDTH     ( 10       ),
    .INIT_OQ        ( 1'b1     ),
    .INIT_TQ        ( 1'b1     ),
    .SERDES_MODE    ( "MASTER" ),
    .SRVAL_OQ       ( 1'b0     ),
    .SRVAL_TQ       ( 1'b0     ),
    .TBYTE_CTL      ( "FALSE"  ),
    .TBYTE_SRC      ( "FALSE"  ),
    .TRISTATE_WIDTH ( 1        )
  ) master_serdes (
    .OFB            (          ),
    .OQ             ( serial   ),
    .SHIFTOUT1      (          ),
    .SHIFTOUT2      (          ),
    .TBYTEOUT       (          ),
    .TFB            (          ),
    .TQ             (          ),
    .CLK            ( clk_250  ),
    .CLKDIV         ( clk_50   ),
    .D1             ( data[0]  ),
    .D2             ( data[1]  ),
    .D3             ( data[2]  ),
    .D4             ( data[3]  ),
    .D5             ( data[4]  ),
    .D6             ( data[5]  ),
    .D7             ( data[6]  ),
    .D8             ( data[7]  ),
    .OCE            ( ce_delay ),
    .RST            ( reset    ),
    .SHIFTIN1       ( shift1   ),
    .SHIFTIN2       ( shift2   ),
    .T1             ( 1'b0     ),
    .T2             ( 1'b0     ),
    .T3             ( 1'b0     ),
    .T4             ( 1'b0     ),
    .TBYTEIN        ( 1'b0     ),
    .TCE            ( 1'b0     )
  );

  // Slave OSERDESE2
  OSERDESE2 #(
    .DATA_RATE_OQ   ( "DDR"   ),
    .DATA_RATE_TQ   ( "DDR"   ),
    .DATA_WIDTH     ( 10      ),
    .INIT_OQ        ( 1'b1    ),
    .INIT_TQ        ( 1'b1    ),
    .SERDES_MODE    ( "SLAVE" ),
    .SRVAL_OQ       ( 1'b0    ),
    .SRVAL_TQ       ( 1'b0    ),
    .TBYTE_CTL      ( "FALSE" ),
    .TBYTE_SRC      ( "FALSE" ),
    .TRISTATE_WIDTH ( 1       )
  ) slave_serdes (
    .OFB            (          ),
    .OQ             (          ),
    .SHIFTOUT1      ( shift1   ),
    .SHIFTOUT2      ( shift2   ),
    .TBYTEOUT       (          ),
    .TFB            (          ),
    .TQ             (          ),
    .CLK            ( clk_250  ),
    .CLKDIV         ( clk_50   ),
    .D1             ( 1'b0     ),
    .D2             ( 1'b0     ),
    .D3             ( data[8]  ),
    .D4             ( data[9]  ),
    .D5             ( 1'b0     ),
    .D6             ( 1'b0     ),
    .D7             ( 1'b0     ),
    .D8             ( 1'b0     ),
    .OCE            ( ce_delay ),
    .RST            ( reset    ),
    .SHIFTIN1       ( 1'b0     ),
    .SHIFTIN2       ( 1'b0     ),
    .T1             ( 1'b0     ),
    .T2             ( 1'b0     ),
    .T3             ( 1'b0     ),
    .T4             ( 1'b0     ),
    .TBYTEIN        ( 1'b0     ),
    .TCE            ( 1'b0     )
  );

  always_ff @( posedge clk ) begin
    ce_delay <= ~reset;
  end

  initial forever #( clock_period / 2 )  clk = ~clk;

  // initial forever #( clock_period / 4 )  clk = ~clk;

  // initial forever #( clock_period / 20 ) clk_x5 = ~clk_x5;

  initial begin
    #200;
    reset = 0;
    #200;
    data  = 10'b1100010101; // 10'h315
    #20;
    data  = 10'b1100110101; // 10'h335
    #20;
    data  = 10'b1100100100; // 10'h324
    #20;
    $finish;
  end

endmodule


2. vhdl tb (added MMCM)

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
library UNISIM;
use UNISIM.VComponents.all;

ENTITY tb_ser IS
END tb_ser;

  ARCHITECTURE behavior OF tb_ser IS
    signal   clk          : std_logic                    := '0';
    signal   data         : std_logic_vector(9 downto 0) := (others => '0');
    signal   reset        : std_logic                    := '1';
    signal   serial       : std_logic                    := '0';

    signal   shift1       : std_logic                    := '0';
    signal   shift2       : std_logic                    := '0';
    signal   ce_delay     : std_logic                    := '0';
    signal   reset_delay  : std_logic_vector(7 downto 0) := (others => '0');

    constant clock_period : time                         := 10 ns;

    signal   clk_50       : std_logic                    := '0';
    signal   clk_250      : std_logic                    := '0';
    signal   clkfb        : std_logic                    := '0';
    signal   locked       : std_logic                    := '0';
  BEGIN

  MMCME2_BASE_inst : MMCME2_BASE
    generic map (
        BANDWIDTH          => "OPTIMIZED",
        DIVCLK_DIVIDE      => 1,
        CLKFBOUT_MULT_F    => 10.0,
        CLKFBOUT_PHASE     => 0.0,
        CLKIN1_PERIOD      => 10.0,
        CLKOUT0_DIVIDE_F   => 1.0,
        CLKOUT1_DIVIDE     => 20,
        CLKOUT2_DIVIDE     => 4,
        CLKOUT3_DIVIDE     => 1,
        CLKOUT4_DIVIDE     => 1,
        CLKOUT5_DIVIDE     => 1,
        CLKOUT6_DIVIDE     => 1,
        CLKOUT0_DUTY_CYCLE => 0.5,
        CLKOUT1_DUTY_CYCLE => 0.5,
        CLKOUT2_DUTY_CYCLE => 0.5,
        CLKOUT3_DUTY_CYCLE => 0.5,
        CLKOUT4_DUTY_CYCLE => 0.5,
        CLKOUT5_DUTY_CYCLE => 0.5,
        CLKOUT6_DUTY_CYCLE => 0.5,
        CLKOUT0_PHASE      => 0.0,
        CLKOUT1_PHASE      => 0.0,
        CLKOUT2_PHASE      => 0.0,
        CLKOUT3_PHASE      => 0.0,
        CLKOUT4_PHASE      => 0.0,
        CLKOUT5_PHASE      => 0.0,
        CLKOUT6_PHASE      => 0.0,
        CLKOUT4_CASCADE    => FALSE,
        REF_JITTER1        => 0.0,
        STARTUP_WAIT       => FALSE
    ) port map (
        CLKOUT0   => open,
        CLKOUT0B  => open,
        CLKOUT1   => clk_50,
        CLKOUT1B  => open,
        CLKOUT2   => clk_250,
        CLKOUT2B  => open,
        CLKOUT3   => open,
        CLKOUT3B  => open,
        CLKOUT4   => open,
        CLKOUT5   => open,
        CLKOUT6   => open,
        CLKFBOUT  => clkfb,
        CLKFBOUTB => open,
        LOCKED    => locked,
        CLKIN1    => clk,
        PWRDWN    => '0',
        RST       => '0',
        CLKFBIN   => clkfb
    );

  master_serdes : OSERDESE2
    generic map (
      DATA_RATE_OQ   => "DDR",
      DATA_RATE_TQ   => "DDR",
      DATA_WIDTH     => 10,
      INIT_OQ        => '1',
      INIT_TQ        => '1',
      SERDES_MODE    => "MASTER",
      SRVAL_OQ       => '0',
      SRVAL_TQ       => '0',
      TBYTE_CTL      => "FALSE",
      TBYTE_SRC      => "FALSE",
      TRISTATE_WIDTH => 1
    ) port map (
      OFB       => open,
      OQ        => serial,
      SHIFTOUT1 => open,
      SHIFTOUT2 => open,
      TBYTEOUT  => open,
      TFB       => open,
      TQ        => open,
      CLK       => clk_250,
      CLKDIV    => clk_50,
      D1        => data(0),
      D2        => data(1),
      D3        => data(2),
      D4        => data(3),
      D5        => data(4),
      D6        => data(5),
      D7        => data(6),
      D8        => data(7),
      OCE       => ce_delay,
      RST       => reset,
      SHIFTIN1  => SHIFT1,
      SHIFTIN2  => SHIFT2,
      T1        => '0',
      T2        => '0',
      T3        => '0',
      T4        => '0',
      TBYTEIN   => '0',
      TCE       => '0'
    );

  slave_serdes : OSERDESE2
    generic map (
      DATA_RATE_OQ   => "DDR",
      DATA_RATE_TQ   => "DDR",
      DATA_WIDTH     => 10,
      INIT_OQ        => '1',
      INIT_TQ        => '1',
      SERDES_MODE    => "SLAVE",
      SRVAL_OQ       => '0',
      SRVAL_TQ       => '0',
      TBYTE_CTL      => "FALSE",
      TBYTE_SRC      => "FALSE",
      TRISTATE_WIDTH => 1
    ) port map (
      OFB       => open,
      OQ        => open,
      SHIFTOUT1 => shift1,
      SHIFTOUT2 => shift2,
      TBYTEOUT  => open,
      TFB       => open,
      TQ        => open,
      CLK       => clk_250,
      CLKDIV    => clk_50,
      D1        => '0',
      D2        => '0',
      D3        => data(8),
      D4        => data(9),
      D5        => '0',
      D6        => '0',
      D7        => '0',
      D8        => '0',
      OCE       => ce_delay,
      RST       => reset,
      SHIFTIN1  => '0',
      SHIFTIN2  => '0',
      T1        => '0',
      T2        => '0',
      T3        => '0',
      T4        => '0',
      TBYTEIN   => '0',
      TCE       => '0'
    );

  delay_ce: process(clk) begin
      if rising_edge(clk) then
        ce_delay <= not reset;
      end if;
    end process;

  clock_process : process begin
    while true loop
      clk <= '0';
      wait for clock_period / 2;
      clk <= '1';
      wait for clock_period / 2;
    end loop;
  end process;

  -- clock_process : process begin
  --   while true loop
  --     clk <= '0';
  --     wait for clock_period / 4;
  --     clk <= '1';
  --     wait for clock_period / 4;
  --   end loop;
  -- end process;

  -- clock_x5_process : process begin
  --   while true loop
  --     clk_x5 <= '0';
  --     wait for clock_period / 20;
  --     clk_x5 <= '1';
  --     wait for clock_period / 20;
  --   end loop;
  -- end process;

  stim_proc: process begin
    wait for 200 ns;
    reset <= '0';
    wait for 200 ns;
    data <= "1100010101";  -- 10'h315
    wait for 20 ns;
    data <= "1100110101";  -- 10'h335
    wait for 20 ns;
    data <= "1100100100";  -- 10'h324
    wait for 20 ns;
    wait;
  end process;

END behavior;

 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2909
  • Country: ca
Re: OSERDES output data delay
« Reply #4 on: April 02, 2025, 02:18:45 pm »
from UG471:
Quote
Every OSERDESE2 in a multiple bit output structure should therefore be driven by the same reset signal, asserted asynchronously, and deasserted synchronously to CLKDIV to ensure that all OSERDESE2 elements come out of reset in synchronization. The reset signal should only be deasserted when it is known that CLK and CLKDIV are stable and present.

You need to keep the OSERDES in reset at least until the "locked" signal of the MMCM goes high, which indicates that the MMCM has locked onto the incoming clock and that the output clocks are stable and match the requested parameters.
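
One way to read that requirement is the sketch below: reset is asserted asynchronously whenever the user reset is high or the MMCM is not locked, and released only after two CLKDIV edges. `serdes_reset_sync` and its port names are hypothetical helpers written for illustration; they come neither from the thread nor from UG471.

```systemverilog
// Hypothetical helper: asynchronous assert, synchronous (to CLKDIV) deassert.
module serdes_reset_sync (
  input  logic clkdiv,     // OSERDESE2 CLKDIV (clk_50 in the testbenches above)
  input  logic ext_reset,  // user/testbench reset, active high
  input  logic locked,     // MMCM LOCKED output
  output logic serdes_rst  // drive RST of both the master and the slave
);
  // Reset is requested when the user asks for it or the MMCM is not locked.
  wire async_rst = ext_reset | ~locked;

  // Two-stage synchronizer, preloaded with 1s so reset is active from time 0.
  logic [1:0] sync = 2'b11;

  always_ff @(posedge clkdiv or posedge async_rst) begin
    if (async_rst) sync <= 2'b11;            // assert immediately
    else           sync <= {sync[0], 1'b0};  // release after two CLKDIV edges
  end

  assign serdes_rst = sync[1];
endmodule
```

In the MMCM testbench this would be instantiated once, with `serdes_rst` wired to `.RST` of both OSERDESE2 instances so that they leave reset on the same CLKDIV edge.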

Offline alexeyey_0 (Topic starter)

  • Newbie
  • Posts: 5
  • Country: ru
Re: OSERDES output data delay
« Reply #5 on: April 02, 2025, 02:31:28 pm »
I still get the delay in SV and do not get it in VHDL.
I tried this:
1. sv

Code: [Select]
    .OCE            ( ce_delay && locked_o  ),
    .RST            ( reset    || !locked_o ),

2. vhdl

Code: [Select]
      OCE       => ce_delay and locked,
      RST       => reset or not locked,

I made this replacement in both the master and the slave.
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2909
  • Country: ca
Re: OSERDES output data delay
« Reply #6 on: April 02, 2025, 08:00:29 pm »
As per UG471, the latency of 10:1 DDR mode is 5 serial clocks ±1 clock if the clocks are phase-aligned.
But why does that matter at all? In SERDES applications the only thing that ever matters is the relative alignment of several SERDES outputs to each other, so that, for example, a forwarded clock allows proper framing in Display LVDS or DVI/HDMI protocols. Nobody cares exactly when the output begins relative to the inputs, as long as it's always the same for all primitives - and this is why the reset is required to ensure proper synchronization.

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 16280
  • Country: fr
Re: OSERDES output data delay
« Reply #7 on: April 02, 2025, 09:58:01 pm »
0.1 ps of delay, shocking!!
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 28724
  • Country: nl
    • NCT Developments
Re: OSERDES output data delay
« Reply #8 on: April 02, 2025, 11:01:10 pm »
This could be a simulation issue. When assigning a clock to a different clock signal in VHDL, the edge will appear one delta cycle later. This has to do with how the simulator deals with simulation events. Maybe Verilog is simulated differently.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2828
  • Country: nz
Re: OSERDES output data delay
« Reply #9 on: April 03, 2025, 03:56:04 am »
I took your SV and VHDL modules, gave them common inputs and outputs, and then ran them both at the same time from this top testbench:

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity tb_top is
end tb_top;

architecture Behavioral of tb_top is
    component tb_ser IS
    port (
        clk    : in  std_logic;
        reset  : in  std_logic;
        data   : in  std_logic_vector(9 downto 0);
        serial : out std_logic
    );
    END component;

    component tb_dvi IS
    port (
        clk    : in  std_logic;
        reset  : in  std_logic;
        data   : in  std_logic_vector(9 downto 0);
        serial : out std_logic
    );
    END component;
   
    signal clk         : std_logic := '0';
    signal serial_vhdl : std_logic;
    signal serial_sv   : std_logic;
    signal reset       : std_logic := '1';
    signal data        : std_logic_vector(9 downto 0) := "0101001111";
    signal difference  : std_logic;
begin

clk_proc: process
    begin
        clk <= '0';
        wait for 5 ns;
        clk <= '1';
        wait for 5 ns;
    end process;

stim_proc: process begin
    wait for 400 ns;
    reset <= '0';
    wait for 400 ns;
    data <= "1100010101";  -- 10'h315
    wait for 20 ns;
    data <= "1100110101";  -- 10'h335
    wait for 20 ns;
    data <= "1100100100";  -- 10'h324
    wait for 20 ns;
    wait;  -- stop here instead of restarting the stimulus
  end process;


-- VHDL implementation
i_tb_ser: tb_ser port map (
        clk    => clk,
        reset  => reset,
        data   => data,
        serial => serial_vhdl
    );

i_tb_dviz: tb_dvi port map (
        clk    => clk,
        reset  => reset,
        data   => data,
        serial => serial_sv
    );
   
   difference <= serial_vhdl xor serial_sv;
end Behavioral;

The simulation shows that after reset the VHDL and SV output are the same.

Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline alexeyey_0 (Topic starter)

  • Newbie
  • Posts: 5
  • Country: ru
Re: OSERDES output data delay
« Reply #10 on: April 03, 2025, 07:57:54 am »
Quote from: asmi
As per UG471, the latency of 10:1 DDR mode is 5 serial clocks ±1 clock if clocks are phase-aligned.
But why does that matter at all? In SERDES applications the only thing that ever matters is a relative alignment of several SERDES outputs between each other, so that for example forwarded clock would allow for proper framing in case of Display LVDS or DVI/HDMI protocols. Nobody cares when exactly output begins relatively to inputs as long as it's always the same for all primitives - and this is why reset is required to ensure proper synchronization.

Quote from: SiliconWizard
0.1 ps of delay, shocking!!

I wouldn't have cared if I hadn't been asked to produce a signal for verification. I can either generate a conditional valid signal for debug only, or count n clock cycles and then take the data. But I can't do either, because the output data is not synchronized to the clock.
« Last Edit: April 03, 2025, 08:20:30 am by alexeyey_0 »
 

Offline Tation

  • Regular Contributor
  • *
  • Posts: 147
  • Country: pt
Re: OSERDES output data delay
« Reply #11 on: April 03, 2025, 12:01:51 pm »
Use separate clocks for:
  • driving the DUT
  • generating stimuli
to ensure that stimuli are generated slightly after the DUT clock edge. Say CLK_stim is CLK_dut delayed by 1 ps.

I think that your issue is the typical delta-delay gotcha often seen in no-delay simulations.
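
A minimal sketch of the two-clock idea, assuming a 100 MHz testbench clock and a 1 ps skew. The module and signal names are made up for illustration, and `sampled` merely stands in for a register inside the DUT.

```systemverilog
`timescale 1ns / 1ps

// Hypothetical testbench fragment: stimulus is clocked 1 ps after the DUT clock.
module tb_stim_skew_sketch;
  logic       clk_dut = 0;  // clock seen by the DUT
  wire        clk_stim;     // same clock, 1 ps later, used only for stimulus
  logic [9:0] data = '0;    // stimulus driven into the DUT
  logic [9:0] sampled;      // stand-in for a DUT input register

  always #5 clk_dut = ~clk_dut;       // 10 ns period
  assign #0.001 clk_stim = clk_dut;   // 0.001 ns = 1 ps with this timescale

  // Stimulus changes 1 ps after each DUT clock edge, never exactly on it.
  always_ff @(posedge clk_stim) data <= data + 10'd1;

  // The DUT side always samples the value that was stable before its own edge.
  always_ff @(posedge clk_dut) sampled <= data;

  initial #100 $finish;
endmodule
```

With this arrangement the sampling result no longer depends on how the simulator orders events that share a timestamp.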

NOTE: arrived at this topic by accident. The title is not descriptive IMHO.
« Last Edit: April 03, 2025, 12:08:54 pm by Tation »
 
The following users thanked this post: nctnico

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2909
  • Country: ca
Re: OSERDES output data delay
« Reply #12 on: April 03, 2025, 06:03:26 pm »
Quote from: alexeyey_0
I wouldn't have cared if I hadn't been asked to make a signal for verification. I can either make a conditional valid signal for debug only, or I can say count n number of clock cycles and get the data. But I can't do either of the first or the second, because the output data is not synchronized by clock.

The last sentence makes no sense - the output is always synchronized to some clock, and the receiver needs to have this clock somehow in order to receive the data successfully. If the clock is not explicitly passed along (as with source-synchronous signals), the receiver either has to recover it when it is embedded in the signal (this is how most modern serial protocols work), or, if it knows the transmit clock parameters in advance, generate a local clock with those parameters - the latter typically requires dynamic link training so that the receiver can find the best sampling points.

The goal of a verification circuit is not to codify 100% of the intended behavior via a million timed transition checks ("at X ps signal A needs to go HIGH"), but to simulate the circuit that your DUT connects to and ensure it works properly with whatever outputs your DUT creates.
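
As a rough illustration of that approach, the sketch below models a receiver in the testbench: it rebuilds 10-bit words from the serial line and reports when an expected word shows up, without caring about the absolute latency. `serial_word_monitor` and the `EXPECTED` parameter are hypothetical, and the bit order assumes that D1 is transmitted first, which should be checked against UG471.

```systemverilog
`timescale 1ns / 1ps

// Hypothetical testbench receiver for the 10:1 DDR stream.
module serial_word_monitor #(
  parameter logic [9:0] EXPECTED = 10'h315  // first word of the thread's stimulus
) (
  input logic clk,    // serial DDR clock (OSERDESE2 CLK)
  input logic serial  // OQ of the master OSERDESE2
);
  logic [9:0] window = '0;  // last 10 received bits, oldest in bit 0

  // DDR: take one bit on every clock edge. Assuming D1 goes out first,
  // shifting right leaves the first bit of a word in window[0].
  always @(clk) begin
    window = {serial, window[9:1]};
    if (window === EXPECTED)
      $display("%0t: word %h seen on the serial line", $time, window);
  end
endmodule
```

A sliding window can also match across word boundaries by coincidence, so a real checker would lock onto a known framing pattern first and then compare every tenth bit position.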

One more thing - functional verification does not simulate any internal delays (routing delays, logic delays, clock tree skew, clock-to-out delays, etc.), so you cannot use it to find out those delays. That is what post-P&R timing sims are for, but they are SLOOOOOOW (because they simulate at gate level), so you'd better really need this and know what you are doing, or you can easily waste an enormous amount of time without achieving anything useful.
 
The following users thanked this post: Someone

Offline glenenglish

  • Frequent Contributor
  • **
  • Posts: 486
  • Country: au
  • RF engineer. AI6UM / VK1XX . Aviation pilot. MTBr
Re: OSERDES output data delay
« Reply #13 on: April 09, 2025, 09:21:25 pm »
Try adding an extra layer of naming in the test bench interface, i.e. DUTname <= aliasname <= testbenchName;

I used to find strange things (wrong clock edges/delays) in Modelsim if I didn't add an extra level of naming (not extra registers).
I never got to the bottom of it in the corner cases, just added the extra level... maybe someone can enlighten me. The Modelsim people thought it was something to do with the Xilinx black box interface between languages.

Anyway: as asmi says... 0) use an MMCM (the datasheet says you must, so you must) 1) put the resets in. 2) tie in the MMCM stable outputs.

Check what it does in the ILA in real hardware... THEN

if it works, check whether it still behaves correctly if, in a running system, you pull reset (warm start) and let it go (i.e. state recovered). If it warm starts correctly, you probably have a usable system...

If my stuff doesn't run from a warm reset, then I know I have not taken care of all reset considerations or init paths...
« Last Edit: April 09, 2025, 09:25:16 pm by glenenglish »
 

