The IDE environment and toolchain I am using can be found by
following this tutorial here. It is based on the OSS-CAD-Suite toolchain (2023-02-10 version) and the Lushay Code plugin.
FPGA chip: GW1NR-9
I'm just learning the Verilog language, and after following their tutorials and asic-world.com's readings, I want to write a fully configurable UART module so I can practice the grammar and syntax of the language. I want to include a pair of FIFO buffers, so I wrote a memory module based on the SSRAM blocks in the chip I have. All that went well, just a few lines of code, so I got to thinking about how to stack the SSRAM blocks together into an array to make an adjustable-size memory module. Simulations went well and I wrote the FIFO module. Again, everything was great until it was not.
I quickly discovered some unexpected behavior in simulation. I traced the errors to the module I use as a wrapper for the memory, which adds the chip select feature.
A little more detail about the design I chose. The basic memory module is wrapped in a module that disables clk and sets dataOut to high impedance when CS goes LOW. Doing this allows me to share a common dataOut bus between all memory modules within the memory array, saving resources and maybe a little bit of power. When a read request is made, dataOut should be valid on the next clock cycle.
But when I toggle the CS port, the data becomes available on the same clock cycle it is requested. Another set of eyes would be greatly appreciated in helping me see where I have made a mistake. I'm 99% sure data is not normally available on the same cycle it is requested. But if it is, that is great news to me.
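To isolate just the clock-gating part of the wrapper described above, here is a minimal sketch. It is not my attached test bench: the module name `gate_demo` and all the timings are made up for illustration. It only shows that `cs & clk` produces its own rising edge whenever cs rises while clk is already HIGH, which may or may not be related to what I am seeing.

```verilog
`timescale 1ns/1ns
// Hypothetical demo (not the real test bench): counts rising edges on clk
// and on the gated clock (cs & clk) to show that the two can differ.
module gate_demo;
    reg clk = 0;
    reg cs  = 0;
    wire gated_clk = cs & clk;      // same gating expression as the wrapper

    integer clk_edges   = 0;
    integer gated_edges = 0;

    always #5 clk = ~clk;           // clk rises at t = 5, 15, 25, ...

    always @(posedge clk)
        clk_edges = clk_edges + 1;

    always @(posedge gated_clk) begin
        gated_edges = gated_edges + 1;
        $display("t=%0t gated_clk rose (clk=%b cs=%b)", $time, clk, cs);
    end

    initial begin
        #7  cs = 1;   // t=7: cs rises while clk is HIGH, so gated_clk rises here
        #6  cs = 0;   // t=13: cs falls while clk is LOW, no edge on gated_clk
        #10 $display("clk edges: %0d, gated edges: %0d", clk_edges, gated_edges);
        $finish;
    end
endmodule
```

In this sketch the gated clock rises at t=7, between two real clock edges, so anything clocked by it would sample its inputs at the moment cs changes rather than at the next edge of clk.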
Below is the minimal code needed to recreate the unexpected behavior, followed by the test bench output. The test bench is attached.
shadow_ram.v
// basic ssram semi-dual port, 4 bits wide, 16 words deep max.
// datasheet: UG285E.pdf Page 55
// URL: http://cdn.gowinsemi.com.cn/UG285E.pdf
module ssram16dp
(
    input wire clk,
    input wire w_re,
    input wire [3:0] raddress,
    input wire [3:0] waddress,
    input wire [3:0] dataIn,
    output reg [3:0] dataOut
);
    reg [3:0] ram [15:0];

    always @(posedge clk) begin
        if( w_re == 1 ) begin
            ram[waddress] <= dataIn;
        end else begin
            dataOut <= ram[raddress];
        end
    end
endmodule
// ssram wrapper with high impedance dataOut and clk disable when chip select is LOW.
module ssram16dp_zstate
(
    input wire clk,
    input wire w_re,
    input wire [3:0] raddress,
    input wire [3:0] waddress,
    input wire [3:0] dataIn,
    output wire [3:0] dataOut,
    input wire cs
);
    wire sram_clk;
    wire [3:0] sram_dataOut;

    ssram16dp sram(
        .clk( sram_clk ),
        .w_re( w_re ),
        .raddress( raddress ),
        .waddress( waddress ),
        .dataIn( dataIn ),
        .dataOut( sram_dataOut )
    );

    assign sram_clk = cs & clk;
    assign dataOut = cs ? sram_dataOut : 4'bzzzz;
endmodule
/* This module stacks the basic blocks together in columns and rows to provide custom width and
depth. Increasing the width is simple, as each column module provides data for its section
of output bits. Increasing the rows (depth) requires a multiplexer to select which
ram blocks are accessed. A chip select option that disables the clock input should save
power, and the high impedance dataOut allows all banks to share a data bus.
*/
module ssram16dp_array #(
    parameter width = 4,
    parameter depth = 16
)
(
    clk,
    w_re,
    raddress,
    waddress,
    dataIn,
    dataOut
);
    localparam array_width = width / 4;
    localparam array_depth = depth / 16;
    localparam address_width = $clog2(depth);

    input wire clk;
    input wire w_re;
    input wire [address_width:0] raddress;
    input wire [address_width:0] waddress;
    input wire [width-1:0] dataIn;
    output wire [width-1:0] dataOut;

    reg [array_depth-1:0] chip_select;

    genvar a, b;
    generate
        for( a = 0; a < array_depth; a = a + 1 ) begin
            for( b = 0; b < array_width; b = b + 1 ) begin
                ssram16dp_zstate memory_array(
                    .clk( clk ),
                    .w_re( w_re ),
                    .raddress( raddress[3:0] ),
                    .waddress( waddress[3:0] ),
                    .dataIn( dataIn [3+b*4 : b*4] ),
                    .dataOut( dataOut[3+b*4 : b*4] ),
                    .cs( chip_select[a] )
                );
            end
        end
    endgenerate

    integer i;
    always @( raddress or waddress or w_re ) begin
        for( i = 0; i < array_depth; i = i + 1 ) begin
            case ( w_re )
                1: chip_select[i] <= ( waddress[address_width:4] == i ) ? 1 : 0;
                0: chip_select[i] <= ( raddress[address_width:4] == i ) ? 1 : 0;
            endcase
        end
    end
endmodule
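As a worked example of the size arithmetic in ssram16dp_array, here is a small sketch that prints the derived values for one parameter choice. The module name `array_params_demo` and the choice of width = 8, depth = 64 are only for illustration and are not part of shadow_ram.v.

```verilog
// Hypothetical helper (not part of shadow_ram.v): prints the sizes that the
// localparam expressions in ssram16dp_array give for width = 8, depth = 64.
module array_params_demo;
    localparam width = 8;
    localparam depth = 64;
    localparam array_width   = width / 4;      // columns of 4-bit blocks
    localparam array_depth   = depth / 16;     // rows of 16-word blocks
    localparam address_width = $clog2(depth);  // bits needed for 0..depth-1

    initial begin
        $display("columns=%0d rows=%0d address_width=%0d",
                 array_width, array_depth, address_width);
        // A port declared [address_width:0] has address_width+1 bits.
        $display("address port bits=%0d, bank select slice=[%0d:4]",
                 address_width + 1, address_width);
    end
endmodule
```

For these values there are 2 columns, 4 rows and address_width is 6, so the address ports declared as [address_width:0] are 7 bits wide, one more bit than 64 words strictly need. That may be worth a second look, though I have not checked whether it matters here.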
Test output
Starting FPGA Toolchain
Starting Testbench with iVerilog
Finished Testbench
Starting Testbench with iVerilog
VCD info: dumpfile UUT.vcd opened for output.
starting test
Writing
1 cs:1 w_re:z DataIn: z waddy: z DataOut: x raddy: z
3 cs:1 w_re:1 DataIn: 0 waddy: 0 DataOut: x raddy: z
5 cs:1 w_re:1 DataIn: 1 waddy: 1 DataOut: x raddy: z
7 cs:1 w_re:1 DataIn: 2 waddy: 2 DataOut: x raddy: z
9 cs:1 w_re:1 DataIn: 3 waddy: 3 DataOut: x raddy: z
11 cs:1 w_re:1 DataIn: 4 waddy: 4 DataOut: x raddy: z
13 cs:1 w_re:1 DataIn: 5 waddy: 5 DataOut: x raddy: z
15 cs:1 w_re:1 DataIn: 6 waddy: 6 DataOut: x raddy: z
17 cs:1 w_re:1 DataIn: 7 waddy: 7 DataOut: x raddy: z
19 cs:1 w_re:1 DataIn: 8 waddy: 8 DataOut: x raddy: z
21 cs:1 w_re:1 DataIn: 9 waddy: 9 DataOut: x raddy: z
23 cs:1 w_re:1 DataIn: 10 waddy: 10 DataOut: x raddy: z
25 cs:1 w_re:1 DataIn: 11 waddy: 11 DataOut: x raddy: z
27 cs:1 w_re:1 DataIn: 12 waddy: 12 DataOut: x raddy: z
29 cs:1 w_re:1 DataIn: 13 waddy: 13 DataOut: x raddy: z
31 cs:1 w_re:1 DataIn: 14 waddy: 14 DataOut: x raddy: z
33 cs:1 w_re:1 DataIn: 15 waddy: 15 DataOut: x raddy: z
35 cs:1 w_re:1 DataIn: 15 waddy: 15 DataOut: x raddy: z
reading
37 cs:1 w_re:0 DataIn: z waddy: z DataOut: x raddy: 0
39 cs:1 w_re:0 DataIn: z waddy: z DataOut: 0 raddy: 1
41 cs:1 w_re:0 DataIn: z waddy: z DataOut: 1 raddy: 2
43 cs:1 w_re:0 DataIn: z waddy: z DataOut: 2 raddy: 3
45 cs:1 w_re:0 DataIn: z waddy: z DataOut: 3 raddy: 4
47 cs:1 w_re:0 DataIn: z waddy: z DataOut: 4 raddy: 5
49 cs:1 w_re:0 DataIn: z waddy: z DataOut: 5 raddy: 6
51 cs:1 w_re:0 DataIn: z waddy: z DataOut: 6 raddy: 7
53 cs:1 w_re:0 DataIn: z waddy: z DataOut: 7 raddy: 8
55 cs:1 w_re:0 DataIn: z waddy: z DataOut: 8 raddy: 9
57 cs:1 w_re:0 DataIn: z waddy: z DataOut: 9 raddy: 10
59 cs:1 w_re:0 DataIn: z waddy: z DataOut: 10 raddy: 11
61 cs:1 w_re:0 DataIn: z waddy: z DataOut: 11 raddy: 12
63 cs:1 w_re:0 DataIn: z waddy: z DataOut: 12 raddy: 13
65 cs:1 w_re:0 DataIn: z waddy: z DataOut: 13 raddy: 14
67 cs:1 w_re:0 DataIn: z waddy: z DataOut: 14 raddy: 15
intermitted cs
69 cs:1 w_re:0 DataIn: z waddy: z DataOut: 15 raddy: 15
71 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 0
73 cs:1 w_re:0 DataIn: z waddy: z DataOut: 1 raddy: 1
75 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 2
77 cs:1 w_re:0 DataIn: z waddy: z DataOut: 3 raddy: 3
79 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 4
81 cs:1 w_re:0 DataIn: z waddy: z DataOut: 5 raddy: 5
83 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 6
85 cs:1 w_re:0 DataIn: z waddy: z DataOut: 7 raddy: 7
87 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 8
89 cs:1 w_re:0 DataIn: z waddy: z DataOut: 9 raddy: 9
91 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 10
93 cs:1 w_re:0 DataIn: z waddy: z DataOut: 11 raddy: 11
95 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 12
97 cs:1 w_re:0 DataIn: z waddy: z DataOut: 13 raddy: 13
99 cs:0 w_re:0 DataIn: z waddy: z DataOut: z raddy: 14
101 cs:1 w_re:0 DataIn: z waddy: z DataOut: 15 raddy: 15
103 cs:1 w_re:0 DataIn: z waddy: z DataOut: 15 raddy: 15
reading
105 cs:1 w_re:0 DataIn: z waddy: z DataOut: 15 raddy: 0
107 cs:1 w_re:0 DataIn: z waddy: z DataOut: 0 raddy: 1
109 cs:1 w_re:0 DataIn: z waddy: z DataOut: 1 raddy: 2
111 cs:1 w_re:0 DataIn: z waddy: z DataOut: 2 raddy: 3
113 cs:1 w_re:0 DataIn: z waddy: z DataOut: 3 raddy: 4
115 cs:1 w_re:0 DataIn: z waddy: z DataOut: 4 raddy: 5
117 cs:1 w_re:0 DataIn: z waddy: z DataOut: 5 raddy: 6
119 cs:1 w_re:0 DataIn: z waddy: z DataOut: 6 raddy: 7
121 cs:1 w_re:0 DataIn: z waddy: z DataOut: 7 raddy: 8
123 cs:1 w_re:0 DataIn: z waddy: z DataOut: 8 raddy: 9
125 cs:1 w_re:0 DataIn: z waddy: z DataOut: 9 raddy: 10
127 cs:1 w_re:0 DataIn: z waddy: z DataOut: 10 raddy: 11
129 cs:1 w_re:0 DataIn: z waddy: z DataOut: 11 raddy: 12
131 cs:1 w_re:0 DataIn: z waddy: z DataOut: 12 raddy: 13
133 cs:1 w_re:0 DataIn: z waddy: z DataOut: 13 raddy: 14
135 cs:1 w_re:0 DataIn: z waddy: z DataOut: 14 raddy: 15
Test finished Properly
c:\Users\\Desktop\fpga_projects\Tang_Nano_9K_blink\src\uart\shadow_ram\ssram16dp_zstate_tb.v:124: $finish called at 137 (1s)
137 cs:1 w_re:0 DataIn: z waddy: z DataOut: 15 raddy: 15
Finished Testbench
Toolchain Completed