
Simulating Memory in Verilog


Rainwater:
The IDE environment and toolchain I am using can be set up by following this tutorial here.
It is based on the OSS-CAD-Suite toolchain (2023-02-10 version) and the Lushay Code plugin.
FPGA chip: GW1NR-9

I'm just learning the Verilog language, and after following their tutorials and asic-world.com's readings, I want to write a fully configurable UART module so I can practice the grammar and syntax of the language. I want to include a pair of FIFO buffers, so I wrote a memory module based on the ssram blocks in the chip I have. All that went well, just a few lines of code, so I got to thinking about how to stack the ssram blocks together into an array to make an adjustable-size memory module. Simulations went well and I wrote the FIFO module. Again, everything was great until it was not.
I quickly discovered some unexpected behavior in simulation. I traced the errors to the module I use as a wrapper for the memory, which adds chip-select features.

A little more detail about the design I chose. The basic memory module is wrapped in a module that disables the clk and sets dataOut to high impedance when CS goes LOW. Doing this allows me to share a common dataOut bus between all memory modules within the memory array, saving resources and maybe a little bit of power. When a read request is made, I expect dataOut to be valid on the next clock cycle.
But when I toggle the CS port, the data becomes available on the same clock cycle it is requested. Another set of eyes would be greatly appreciated in helping me see where I have made a mistake. I'm 99% sure data is not normally available on the same cycle it is requested. But if it is, this is great news to me.

Below is the minimal code needed to recreate the unexpected behavior, along with the test bench output. The test bench is attached.
shadow_ram.v

--- Code: ---// basic ssram semi-dual port 4bits wide 16 words deep max.
// datasheet: UG285E.pdf Page 55
// URL: http://cdn.gowinsemi.com.cn/UG285E.pdf
module ssram16dp
(
    input   wire        clk,
    input   wire        w_re,
    input   wire[3:0]   raddress,
    input   wire[3:0]   waddress,
    input   wire[3:0]   dataIn,
    output  reg [3:0]   dataOut
);
    reg[3:0] ram [15:0];

    always @(posedge clk) begin
        if( w_re == 1 ) begin
            ram[waddress] <= dataIn;
        end else begin
            dataOut <= ram[raddress];
        end
    end
endmodule

// ssram wrapper: dataOut goes high impedance and clk is disabled while chip select (cs) is LOW.
module ssram16dp_zstate
(
    input   wire        clk,
    input   wire        w_re,
    input   wire[3:0]   raddress,
    input   wire[3:0]   waddress,
    input   wire[3:0]   dataIn,
    output  wire[3:0]   dataOut,
    input   wire        cs
);
    wire        sram_clk;
    wire [3:0]  sram_dataOut;

    ssram16dp sram(
        .clk(       sram_clk ),
        .w_re(      w_re ),
        .raddress(  raddress ),
        .waddress(  waddress ),
        .dataIn(    dataIn ),
        .dataOut(   sram_dataOut )
    );

    assign sram_clk = cs & clk;
    assign dataOut  = cs ? sram_dataOut : 4'bzzzz;
endmodule

/* This module stacks the basic blocks together in columns and rows to provide custom width and
depth. Increasing the width is simple, as each column module provides data for its own section
of output bits. Increasing the rows (depth) requires a multiplexer to select which
ram blocks will be accessed. A chip select option that disables the clock input should save
power, and the high impedance dataOut allows all banks to share a data bus.
*/
module ssram16dp_array #(
    parameter width = 4,
    parameter depth = 16
)
(
    clk,
    w_re,
    raddress,
    waddress,
    dataIn,
    dataOut
);
    localparam array_width      = width / 4;
    localparam array_depth      = depth / 16;
    localparam address_width    = $clog2(depth);

    input wire                      clk;
    input wire                      w_re;
    input wire  [address_width:0]   raddress;
    input wire  [address_width:0]   waddress;
    input wire  [width-1:0]         dataIn;
    output wire [width-1:0]         dataOut;

    reg [array_depth-1:0]  chip_select;   
    genvar a, b;
    generate
        for( a = 0; a < array_depth; a = a + 1 ) begin
            for( b = 0; b < array_width; b = b + 1 ) begin
                ssram16dp_zstate memory_array(
                    .clk(       clk ),
                    .w_re(      w_re ),
                    .raddress(  raddress[3:0] ),
                    .waddress(  waddress[3:0] ),
                    .dataIn(    dataIn [3+b*4 : b*4] ),
                    .dataOut(   dataOut[3+b*4 : b*4] ),
                    .cs(        chip_select[a] )
                );
            end
        end
    endgenerate

    integer i;   
    always @( raddress or waddress or w_re ) begin
        for( i = 0; i < array_depth; i = i + 1 ) begin
            case ( w_re )
                1: chip_select[i] <= ( waddress[address_width:4] == i ) ? 1 : 0;
                0: chip_select[i] <= ( raddress[address_width:4] == i ) ? 1 : 0;
            endcase
        end   
    end
endmodule
--- End code ---

Test output

--- Code: ---Starting FPGA Toolchain
Starting Testbench with iVerilog
Finished Testbench
Starting Testbench with iVerilog
    VCD info: dumpfile UUT.vcd opened for output.
    starting test
    Writing
                       1 cs:1 w_re:z DataIn:  z waddy:  z DataOut:  x raddy:  z
                       3 cs:1 w_re:1 DataIn:  0 waddy:  0 DataOut:  x raddy:  z
                       5 cs:1 w_re:1 DataIn:  1 waddy:  1 DataOut:  x raddy:  z
                       7 cs:1 w_re:1 DataIn:  2 waddy:  2 DataOut:  x raddy:  z
                       9 cs:1 w_re:1 DataIn:  3 waddy:  3 DataOut:  x raddy:  z
                      11 cs:1 w_re:1 DataIn:  4 waddy:  4 DataOut:  x raddy:  z
                      13 cs:1 w_re:1 DataIn:  5 waddy:  5 DataOut:  x raddy:  z
                      15 cs:1 w_re:1 DataIn:  6 waddy:  6 DataOut:  x raddy:  z
                      17 cs:1 w_re:1 DataIn:  7 waddy:  7 DataOut:  x raddy:  z
                      19 cs:1 w_re:1 DataIn:  8 waddy:  8 DataOut:  x raddy:  z
                      21 cs:1 w_re:1 DataIn:  9 waddy:  9 DataOut:  x raddy:  z
                      23 cs:1 w_re:1 DataIn: 10 waddy: 10 DataOut:  x raddy:  z
                      25 cs:1 w_re:1 DataIn: 11 waddy: 11 DataOut:  x raddy:  z
                      27 cs:1 w_re:1 DataIn: 12 waddy: 12 DataOut:  x raddy:  z
                      29 cs:1 w_re:1 DataIn: 13 waddy: 13 DataOut:  x raddy:  z
                      31 cs:1 w_re:1 DataIn: 14 waddy: 14 DataOut:  x raddy:  z
                      33 cs:1 w_re:1 DataIn: 15 waddy: 15 DataOut:  x raddy:  z
                      35 cs:1 w_re:1 DataIn: 15 waddy: 15 DataOut:  x raddy:  z
    reading
                      37 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  x raddy:  0
                      39 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  0 raddy:  1
                      41 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  1 raddy:  2
                      43 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  2 raddy:  3
                      45 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  3 raddy:  4
                      47 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  4 raddy:  5
                      49 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  5 raddy:  6
                      51 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  6 raddy:  7
                      53 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  7 raddy:  8
                      55 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  8 raddy:  9
                      57 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  9 raddy: 10
                      59 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 10 raddy: 11
                      61 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 11 raddy: 12
                      63 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 12 raddy: 13
                      65 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 13 raddy: 14
                      67 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 14 raddy: 15
    intermitted cs
                      69 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 15 raddy: 15
                      71 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy:  0
                      73 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  1 raddy:  1
                      75 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy:  2
                      77 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  3 raddy:  3
                      79 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy:  4
                      81 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  5 raddy:  5
                      83 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy:  6
                      85 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  7 raddy:  7
                      87 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy:  8
                      89 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  9 raddy:  9
                      91 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy: 10
                      93 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 11 raddy: 11
                      95 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy: 12
                      97 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 13 raddy: 13
                      99 cs:0 w_re:0 DataIn:  z waddy:  z DataOut:  z raddy: 14
                     101 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 15 raddy: 15
                     103 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 15 raddy: 15
    reading
                     105 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 15 raddy:  0
                     107 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  0 raddy:  1
                     109 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  1 raddy:  2
                     111 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  2 raddy:  3
                     113 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  3 raddy:  4
                     115 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  4 raddy:  5
                     117 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  5 raddy:  6
                     119 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  6 raddy:  7
                     121 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  7 raddy:  8
                     123 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  8 raddy:  9
                     125 cs:1 w_re:0 DataIn:  z waddy:  z DataOut:  9 raddy: 10
                     127 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 10 raddy: 11
                     129 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 11 raddy: 12
                     131 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 12 raddy: 13
                     133 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 13 raddy: 14
                     135 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 14 raddy: 15
    Test finished Properly
    c:\Users\\Desktop\fpga_projects\Tang_Nano_9K_blink\src\uart\shadow_ram\ssram16dp_zstate_tb.v:124: $finish called at 137 (1s)
                     137 cs:1 w_re:0 DataIn:  z waddy:  z DataOut: 15 raddy: 15
Finished Testbench
Toolchain Completed

--- End code ---
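
One thing worth checking in the wrapper above is the line `assign sram_clk = cs & clk;`. With an AND-gated clock, the RAM sees a rising edge not only when clk rises, but also whenever cs rises while clk is already high. The attached test bench is not shown here, so whether cs actually changes during the high phase of clk is an assumption. The small stand-alone demo below (module name and timing are made up for illustration) counts the rising edges on such a gated clock:

```verilog
// Hypothetical demo (not the attached test bench): counts rising edges
// on an AND-gated clock built the same way as sram_clk in ssram16dp_zstate.
module gated_clk_demo;
    reg  clk = 0;
    reg  cs  = 0;
    wire gated_clk = cs & clk;      // same gating as the wrapper

    integer edges = 0;              // rising edges seen on gated_clk

    always #2 clk = ~clk;           // clk rises at t=2, 6, 10, ...

    always @(posedge gated_clk) begin
        edges = edges + 1;
        $display("t=%0t posedge gated_clk (clk=%b cs=%b)", $time, clk, cs);
    end

    initial begin
        #3 cs = 1;      // t=3: clk is already high, so gated_clk rises here
        #4 cs = 0;      // t=7: deselect (the normal clk edge at t=6 was passed on)
        #2 cs = 1;      // t=9: clk is low, so selecting causes no extra edge
        #2              // t=11: the normal clk edge at t=10 was passed on
        $display("rising edges on gated_clk: %0d", edges);
        $finish;
    end
endmodule
```

It can be run with `iverilog -o demo gated_clk_demo.v` followed by `vvp demo` (the file name is arbitrary). The edge at t=3 does not line up with any rising edge of clk, which is the kind of extra edge that could make a clocked read appear to complete in the same cycle.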

BrianHG:
Here are a few things I have written which you may just read/use/learn and experiment with:

A few different FIFOs I've played with:
(There are 4 different FIFO modules I have written in this one source file; only the last one, called 'module BHG_FIFO_Xword_FWFT', would probably be of interest to you.  The first one is a configurable nasty sequential bucket shifter I created for insane FMAX speeds.)
https://github.com/BrianHGinc/BrianHG-DDR3-Controller/blob/main/BrianHG_DDR3/BrianHG_DDR3_FIFOs.sv


A simple synchronous UART for full duplex high speed RS232 with PCs:  (MCU UARTs typically work without the bi-dir sync requirement, but they implement it anyway)
https://www.eevblog.com/forum/fpga/verilog-rs232-uart-and-rs232-debugger-source-code-and-educational-tutorial/

Note that they are written in SystemVerilog.

If you want to see simple simulation examples for some other tiny projects, look here:
https://www.eevblog.com/forum/fpga/verilog-floating-point-clock-divider-release/
https://www.eevblog.com/forum/fpga/bhg_i2c_init_rs232_debugger-an-i2c-initializer-with-integrated-rs232-debugger/

Rainwater:
After some sleep and studying the waveform, I think I have figured out what is happening. The error only occurs 1 clock cycle after changing CS.
CS HIGH to LOW
    The previous write request(w_re=HIGH) will be completed.
    The previous read request(w_re=LOW) gets canceled.

CS LOW to HIGH
   A write request will complete.
   A read request will start.
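
Since these effects only show up around CS changes, one option worth trying is to stop gating the clock and treat cs as a clock enable instead. The following is only a sketch under that assumption: the module name `ssram16dp_ce` and the register name `read_reg` are made up, and it keeps the clocked read of the original ssram16dp model.

```verilog
// Hypothetical variant of ssram16dp_zstate: cs acts as a clock enable
// instead of being ANDed with clk, so the RAM only reacts to real clk edges.
module ssram16dp_ce
(
    input   wire        clk,
    input   wire        cs,
    input   wire        w_re,
    input   wire[3:0]   raddress,
    input   wire[3:0]   waddress,
    input   wire[3:0]   dataIn,
    output  wire[3:0]   dataOut
);
    reg [3:0] ram [15:0];
    reg [3:0] read_reg;                 // registered read data

    always @(posedge clk) begin
        if( cs ) begin                  // edges are ignored while deselected
            if( w_re == 1 )
                ram[waddress] <= dataIn;
            else
                read_reg <= ram[raddress];
        end
    end

    // Shared-bus behaviour is unchanged: release the bus while deselected.
    assign dataOut = cs ? read_reg : 4'bzzzz;
endmodule
```

With this form, a change on cs between clock edges cannot create an edge of its own; it only decides whether the next real rising edge of clk does anything.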

This led me to review my basic memory module, whose timing does NOT match the timing diagrams in the datasheet. My major error is that I assumed the timing of the memory and did not check the datasheet.

5 minutes later....
Great news, data is available at the rising clock edge. Yippee for me!!!
My errors, which I'm going to chalk up to being green:
1) The datasheet says that the memory is "asynchronous", meaning it can read and write at the same time.
2) It samples on the negative clock edge.
My ssram16dp module's always block should look like this:

--- Code: ---    always @(negedge clk) begin
        if( w_re == 1 )
            ram[waddress] <= dataIn;
        dataOut <= ram[raddress];
    end
--- End code ---
Wham bam, thank you ma'am, it works.
Except it totally doesn't work on the chip.
Simulation looks good though.

Got it figured out. I just had to read EVERY sentence in the datasheet regarding the timing. The following produces a result that matches the datasheet timing diagram.

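--- Code: ---    // dataOut must be declared as a wire for this continuous assignment
    assign dataOut = ram[raddress];     // read is asynchronous: follows raddress directly

    always @( posedge clk ) begin       // write is still clocked
        if( w_re == 1 )
            ram[waddress] <= dataIn;
    end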
--- End code ---
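
For anyone who wants to try this, here is the fragment above dropped into a complete module, with a small self-checking test bench. This is a sketch, not the attached test bench: the names `ssram16dp_async` and `ssram16dp_async_tb` and the clock timing are made up for illustration.

```verilog
// Sketch of the final memory model: synchronous write, asynchronous read.
module ssram16dp_async
(
    input   wire        clk,
    input   wire        w_re,
    input   wire[3:0]   raddress,
    input   wire[3:0]   waddress,
    input   wire[3:0]   dataIn,
    output  wire[3:0]   dataOut
);
    reg [3:0] ram [15:0];

    assign dataOut = ram[raddress];     // read follows raddress, no clock needed

    always @( posedge clk ) begin
        if( w_re == 1 )
            ram[waddress] <= dataIn;
    end
endmodule

// Hypothetical test bench: writes i to address i, then reads everything back.
module ssram16dp_async_tb;
    reg        clk = 0;
    reg        w_re = 0;
    reg  [3:0] raddress = 0;
    reg  [3:0] waddress = 0;
    reg  [3:0] dataIn = 0;
    wire [3:0] dataOut;
    integer    i;
    integer    errors;

    ssram16dp_async uut(
        .clk(       clk ),
        .w_re(      w_re ),
        .raddress(  raddress ),
        .waddress(  waddress ),
        .dataIn(    dataIn ),
        .dataOut(   dataOut )
    );

    always #2 clk = ~clk;

    initial begin
        errors = 0;
        // Set up each write on a falling edge; it is stored on the next rising edge.
        for( i = 0; i < 16; i = i + 1 ) begin
            @(negedge clk);
            w_re = 1; waddress = i; dataIn = i;
        end
        @(negedge clk) w_re = 0;
        // Read back: only wait a short delay, not a clock edge, before checking.
        for( i = 0; i < 16; i = i + 1 ) begin
            raddress = i;
            #1;
            if( dataOut !== i ) begin
                errors = errors + 1;
                $display("mismatch at address %0d: got %0d", i, dataOut);
            end
        end
        if( errors == 0 ) $display("PASS");
        else              $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

The read loop waits a fixed delay rather than a clock edge, so a pass means dataOut tracks raddress on its own, which is the same-cycle behavior described in the datasheet timing diagram.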
