Alright, just for the sake of the discussion, I replaced this (#1):
always @(posedge clk) begin
    if(!resetn) begin
        cfg_reg <= 32'h 0000_0000;
        cfg_wr_ready <= 0;
    end else begin
        if((reg_we != 4'b 0000) && (reg_addr == 4'b 0000)) begin
            cfg_reg <= reg_di;
            cfg_wr_ready <= 1;
        end else begin
            cfg_wr_ready <= 0;
        end
    end
end
with this (#2):
always @(posedge clk) begin
    cfg_wr_ready <= 0;
    cfg_reg <= cfg_reg; // This does look a bit funky...
    if((reg_we != 4'b 0000) && (reg_addr == 4'b 0000)) begin
        cfg_reg <= reg_di;
        cfg_wr_ready <= 1;
    end
    if(!resetn) begin
        cfg_reg <= 32'h 0000_0000;
        cfg_wr_ready <= 0;
    end
end
Based on the "last statement wins" rule, this is the order of priority I'd expect, and both versions generate a bitstream and work just fine (in simulation and on real hardware)... I assume they are technically the same? I still find the "something <= something" statement a bit unsettling... must be because I haven't done this in a while...
Not considering whether the "reset" MUST be there or not (let's say yes... for now)... is one way or the other "better" (that is very subjective, I know), or what is considered better practice?
I won't answer the "better practice" question, because I think there are too many opinions about this.
But in your second block, the assignment cfg_reg <= cfg_reg; is not needed because the flip-flops hold their state unless there's a new assignment. Also, the code describes cfg_wr_ready as a one-shot strobe, asserted for only one clock tick, which is reasonable. cfg_reg, assigned a new value at the same time, will hold that new value until the next time that register is written. There is no reason to clear it after each register write finishes, right?
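For reference, here is a sketch of your second block with the self-assignment dropped. The module wrapper, its name, and the port widths are my assumptions (your post doesn't show the declarations); I guessed 4 bits for reg_we and reg_addr and 32 bits for reg_di from the constants you compare against.

```verilog
// Hypothetical wrapper; only the always block comes from the original post.
module cfg_reg_example (
    input  wire        clk,
    input  wire        resetn,
    input  wire [3:0]  reg_we,    // assumed width
    input  wire [3:0]  reg_addr,  // assumed width
    input  wire [31:0] reg_di,    // assumed width
    output reg  [31:0] cfg_reg,
    output reg         cfg_wr_ready
);
    always @(posedge clk) begin
        cfg_wr_ready <= 0;                  // default: strobe is low
        // No "cfg_reg <= cfg_reg;" here: with no assignment, the flops hold.
        if ((reg_we != 4'b0000) && (reg_addr == 4'b0000)) begin
            cfg_reg      <= reg_di;         // capture the write data
            cfg_wr_ready <= 1;              // one-cycle strobe
        end
        if (!resetn) begin                  // last assignment wins: reset has priority
            cfg_reg      <= 32'h0000_0000;
            cfg_wr_ready <= 0;
        end
    end
endmodule
```

The only default you need at the top is the one for cfg_wr_ready, because that signal really does have to go back to 0 on every cycle without a write.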
Regarding "flip-flops hold their state unless assigned," let's discuss that for a moment. On every tick of the clock, the output of the flip-flop updates. It might not change state, but it updates.
The basic flip-flop has two synchronous inputs (ignoring the synchronous preset or clear). These inputs are the D (data) input, which is obvious, and the CE (clock enable) input. The CE controls whether what is on the D input appears on the Q output, or whether the previous output continues to drive. It's called a clock enable because once upon a time (and still, in certain cases), the input literally gated the clock. If CE was off, then the flop never saw the clock pulse, so the output would not change. FPGA flops have a "recirculating mux" built in. This is exactly what it says on the tin. The D input feeds one input to that mux, the flop's Q output feeds the other input, and the CE chooses which to clock into the flop. CE = 1 usually means D input.
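To make the recirculating mux concrete, here is a minimal behavioral sketch of a single flop with a clock enable. The module and signal names are made up for illustration, and this is a model of the behavior, not a vendor primitive.

```verilog
// Hypothetical module: one D flip-flop with clock enable,
// written with the recirculating mux spelled out.
module dff_ce (
    input  wire clk,
    input  wire ce,
    input  wire d,
    output reg  q
);
    // ce = 1 selects the D input, ce = 0 feeds Q back into the flop.
    wire d_mux = ce ? d : q;

    // The flop itself updates on every clock edge; whether it
    // changes state depends on what the mux presents.
    always @(posedge clk)
        q <= d_mux;
endmodule
```

Writing the body as "always @(posedge clk) if (ce) q <= d;" describes the same behavior, and that is the form most people use.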
The logic that drives the CE input is connected to CLB resources, just like the D input, so the logic equations that drive the CE input can be pretty complex. This is why, when you code something you might think is an obvious CE, the synthesis tool can produce something that has the same behavior but a different implementation.
For your cfg_reg assignment then, you might expect the synthesizer to build a CE that is asserted true when ((reg_we != 4'b 0000) && (reg_addr == 4'b 0000)) is true, which allows the values on reg_di to update the flops. When that write enable/address compare result is false, the CE is off so cfg_reg doesn't change because the old value is fed back into the flop.
I didn't forget the synchronous preset and clear! The synthesis tool will use them, if they exist in your fabric (!), if any of your logic describes something that obviously sets or resets a flip-flop. You might see some complex logic driving the PRE or CLR signals in the same way you see it driving D or CE. Now, this is separate from the global reset. Your if(!resetn) statement sits inside a clocked block, so it describes a synchronous reset; the tool should treat it as the global reset, assuming resetn is "global" in scope.
The crux here is that the synthesis tool will build logic out of the available resources to implement what you describe. The result should match what you get when you simulate your design. This is really all that matters, right? The logic generated may be "inefficient" or it might be clever. You really don't care until you don't meet timing constraints or you run out of resources. That's when you start looking into what the synthesis tool generated.
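If you want to check for yourself that #1 and #2 match in simulation, a self-checking testbench along these lines would do it. This is only a sketch: the testbench name, the _a/_b signal names, the random stimulus and the occasional mid-run reset are all my own choices, and I kept your two always blocks inline rather than guessing at your module boundaries.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench: runs both coding styles side by side and compares them.
module tb_cfg_compare;
    reg         clk      = 0;
    reg         resetn   = 0;
    reg  [3:0]  reg_we   = 0;
    reg  [3:0]  reg_addr = 0;
    reg  [31:0] reg_di   = 0;

    reg  [31:0] cfg_reg_a, cfg_reg_b;
    reg         rdy_a, rdy_b;
    integer     i;
    integer     errors = 0;

    wire hit = (reg_we != 4'b0000) && (reg_addr == 4'b0000);

    // Style #1: reset in the outer if/else.
    always @(posedge clk) begin
        if (!resetn) begin
            cfg_reg_a <= 32'h0000_0000;
            rdy_a     <= 0;
        end else if (hit) begin
            cfg_reg_a <= reg_di;
            rdy_a     <= 1;
        end else begin
            rdy_a     <= 0;
        end
    end

    // Style #2: defaults first, reset last ("last assignment wins").
    always @(posedge clk) begin
        rdy_b <= 0;
        if (hit) begin
            cfg_reg_b <= reg_di;
            rdy_b     <= 1;
        end
        if (!resetn) begin
            cfg_reg_b <= 32'h0000_0000;
            rdy_b     <= 0;
        end
    end

    always #5 clk = ~clk;

    // Compare on the falling edge, after the registers have settled.
    always @(negedge clk)
        if (cfg_reg_a !== cfg_reg_b || rdy_a !== rdy_b) begin
            errors = errors + 1;
            $display("MISMATCH at %0t", $time);
        end

    initial begin
        repeat (3) @(negedge clk);
        resetn = 1;
        for (i = 0; i < 200; i = i + 1) begin
            @(negedge clk);                 // change inputs away from the sampling edge
            reg_we   = $random;
            reg_addr = $random & 1;         // address 0 roughly half the time
            reg_di   = $random;
            resetn   = (i % 50 != 49);      // pulse reset now and then
        end
        @(negedge clk);
        #1;                                 // let the checker above run first
        if (errors == 0) $display("PASS: both styles matched");
        else             $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

Any difference between the two styles would show up as a MISMATCH line with a timestamp, which tells you where to look in the waveform.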
Good luck.