Author Topic: A simple Clock Domain Crossing (CDC) strategy  (Read 1927 times)

Offline josuah (Topic starter)

  • Regular Contributor
  • *
  • Posts: 119
  • Country: fr
    • josuah.net
A simple Clock Domain Crossing (CDC) strategy
« on: May 19, 2022, 05:28:18 am »
Someone (not someone named "Someone" on this forum, someone else) showed me this paper:
http://web.cse.msu.edu/~cse820/readings/sutherlandMicropipelinesTuring.pdf

Long story (19 pages) short:
Two modules, each in its own clock domain, communicate through one control wire each, plus a data bus (an array of wires) driven by the sender.
When the sender module toggles its control wire "req", the receiver module sets its control wire "ack" to match.
Before toggling its wire, the sender module sets the data bus.
Before updating its wire, the receiver module copies the data bus.
That way, the data bus is only read while it is guaranteed to be stable, thanks to the control wires:

Code: [Select]
                  :   :   :   :   :   :   :   :   :   :   :   :
                __:_______________:_______________:______________
handshake_data  __X_______________X_______________X______________
                  :    _______________:   :   :   :    __________
handshake_req   ______/   :   :   :   \_______________/   :   :
                  :   :   :   :_______________:   :   :   :   :__
handshake_ack   ______________/   :   :   :   \_______________/
                  :   :   :   :   :   :   :   :   :   :   :   :
                 (1) (2) (3) (4) (1) (2) (3) (4) (1) (2) (3) (4)

  • When the source has data to transfer, it first sets `handshake_data` to the data to transfer (1), then inverts `handshake_req` (2).
  • Once the destination notices it, it copies `handshake_data` to a local register (3) then sets `handshake_ack` to the same value as `handshake_req` (4).

If I got it right, it should give us something like this for the source that exports data:

See comment below: handshake_ack_x is one flip flop, there need to be two in series!

Code: [Select]
module clock_domain_export #(
        parameter SIZE = 8
) (
        input wire clk,

        // data submission
        input wire [SIZE-1:0] data,
        input wire stb,
        output wire ready,

        // handshake with the other clock domain
        output reg [SIZE-1:0] handshake_data,
        output reg handshake_req,
        input wire handshake_ack
);
        reg handshake_ack_x;

        assign ready = (handshake_ack_x == handshake_req);

        always @(posedge clk) begin
                // prevent metastable state propagation
                handshake_ack_x <= handshake_ack;

                if (ready && stb) begin
                        handshake_data <= data;
                        handshake_req <= !handshake_req;
                end
        end
endmodule

And this on the other clock domain, importing the data:

See comment below: handshake_req_x is one flip flop, there need to be two in series!

Code: [Select]
module clock_domain_import #(
        parameter SIZE = 8
) (
        input wire clk,

        // data reception
        output reg [SIZE-1:0] data,
        output reg stb,

        // handshake with the other clock domain
        input wire [SIZE-1:0] handshake_data,
        input wire handshake_req,
        output reg handshake_ack
);
        reg handshake_req_x = 0;

        always @(posedge clk) begin
                // prevent metastable state propagation
                handshake_req_x <= handshake_req;

                stb <= 0;
                if (handshake_req_x != handshake_ack) begin
                        data <= handshake_data;
                        stb <= 1;
                        handshake_ack <= handshake_req_x;
                end
        end
endmodule

This seems to work rather well in simulation, but I am surprised by how little code is required.
So in the end, would CDC be one of those topics that are hard to prove correct and hard to debug (works on the test bench, fails in the client's hands, works again once returned), yet do not necessarily require complex code?

What do you think of my naive approach?

Related post: https://www.eevblog.com/forum/fpga/learning-clock-domain-transitions-on-fpgas-(xilinx)/msg1286966/#msg1286966
« Last Edit: May 19, 2022, 09:56:48 am by josuah »
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11234
  • Country: us
    • Personal site
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #1 on: May 19, 2022, 05:56:01 am »
Overall, this is a very typical and standard way to synchronize a single word transfer.

But your implementation is not correct. Both REQ and ACK need to be 2FF synchronized to the target clock domain. You can't simply sample a signal generated in one domain with a clock from another domain. You will run into cases where setup/hold times are not met. 2FF synchronizer will prevent this.

Your single register (handshake_req_x) does not prevent meta-stability, and that potentially meta-stable level gets used in the logic. This is not going to work.
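For readers following along, a minimal sketch of the kind of 2FF synchronizer being described. The module name `sync_2ff`, its port names and the initial values are illustrative assumptions, not code from this thread:

```verilog
// Sketch of a two-flip-flop synchronizer (names are hypothetical).
module sync_2ff (
        input  wire clk,       // destination clock domain
        input  wire async_in,  // level generated in another clock domain
        output wire sync_out   // intended for use by logic clocked by clk
);
        // Initial values assume an FPGA or simulator that honours them;
        // a design with an explicit reset would use that instead.
        reg meta = 1'b0;       // first stage: may go metastable
        reg sync = 1'b0;       // second stage: gives the first one a full cycle to settle

        always @(posedge clk) begin
                meta <= async_in;
                sync <= meta;
        end

        assign sync_out = sync;
endmodule
```

Only `sync_out` should feed the handshake logic; the first stage must not be used anywhere else.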

Doing this synchronization slows down the transfer a lot, so this method is not very useful for sustained transfers. In that case, a FIFO with Gray-coded pointers is used.

Edit: wow, that's some vintage article.
« Last Edit: May 19, 2022, 06:03:06 am by ataradov »
Alex
 

Offline AndyC_772

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
  • Professional design engineer
    • Cawte Engineering | Reliable Electronics
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #2 on: May 19, 2022, 06:13:38 am »
I use this technique all the time. It's ideal for things like register settings, where data is written to an SPI slave interface in the FPGA, but must end up in the same clock domain as the bulk of the FPGA's internal logic.

Even without any data being transferred at all, it's still a useful handshake mechanism for two processes to communicate an event to each other. One changes the state of a signal to indicate that 'something has happened'; the other detects that the signal and its corresponding ack are different, performs some action, and sets the value of the ack so that the two are equal again. The actual state (ie. whether they're both 1 or both 0) doesn't matter.

It's a handy workaround for the fact that a signal can't be set in more than one place. In conventional software, we'd just use a flag which is set in one process and cleared in another, but in hardware we effectively manipulate the two inputs of an XOR gate to set the output to a desired value.
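As a rough illustration of that XOR idea, here is a minimal single-clock sketch. The module name `toggle_flag` and its ports are hypothetical and not taken from this thread; across two clock domains, each side would additionally have to see the other side's signal through a 2FF synchronizer.

```verilog
// Hypothetical illustration: a flag "set" by one process and "cleared" by
// another, built from two registers that each have a single writer.
module toggle_flag (
        input  wire clk,
        input  wire set,      // request from "process A"
        input  wire clear,    // acknowledgement from "process B"
        output wire pending   // the flag: high while req and ack differ
);
        reg req = 1'b0;       // only ever written by process A
        reg ack = 1'b0;       // only ever written by process B

        assign pending = req ^ ack;

        // process A: raise the flag by making req differ from ack
        always @(posedge clk)
                if (set && !pending)
                        req <= !req;

        // process B: lower the flag by making ack equal to req again
        always @(posedge clk)
                if (clear && pending)
                        ack <= req;
endmodule
```

Whether `req` and `ack` are both 0 or both 1 when idle is irrelevant; only their difference carries information.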

Offline josuah (Topic starter)

  • Regular Contributor
  • *
  • Posts: 119
  • Country: fr
    • josuah.net
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #3 on: May 19, 2022, 09:53:06 am »
Quote from: AndyC_772
It's ideal for things like register settings, where data is written to an SPI slave interface in the FPGA, but must end up in the same clock domain as the bulk of the FPGA's internal logic.

This is my use-case! Well, for a Wishbone master driven by SPI...

Quote from: AndyC_772
manipulate the two inputs of an XOR gate to set the output to a desired value

A very concise summary of what happens.

Quote from: ataradov
But your implementation is not correct. Both REQ and ACK need to be 2FF synchronized to the target clock domain. You can't simply sample a signal generated in one domain with a clock from another domain. You will run into cases where setup/hold times are not met. 2FF synchronizer will prevent this.

Your single register (handshake_req_x) does not prevent meta-stability, and that potentially meta-stable level gets used in the logic. This is not going to work.

Thanks a lot! The fact that there was only one flip-flop slipped through somehow. This is one of those cases that simulation could not catch.
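For reference, a minimal sketch of what the import side might look like with two flip-flops in series. The module name `clock_domain_import_2ff`, the intermediate register `handshake_req_m` and the initial values are assumptions added for illustration; the export side would need the same treatment for `handshake_ack`.

```verilog
// Sketch only: import side with a 2FF synchronizer on handshake_req.
module clock_domain_import_2ff #(
        parameter SIZE = 8
) (
        input wire clk,

        // data reception
        output reg [SIZE-1:0] data,
        output reg stb,

        // handshake with the other clock domain
        input wire [SIZE-1:0] handshake_data,
        input wire handshake_req,
        output reg handshake_ack
);
        // two flip-flops in series; only the second one feeds the logic
        reg handshake_req_m = 0;
        reg handshake_req_x = 0;

        // assumed power-up state, so that req and ack start out equal
        initial begin
                stb = 0;
                handshake_ack = 0;
        end

        always @(posedge clk) begin
                handshake_req_m <= handshake_req;
                handshake_req_x <= handshake_req_m;

                stb <= 0;
                if (handshake_req_x != handshake_ack) begin
                        // the source set handshake_data before toggling req,
                        // so it has been stable for at least two clk cycles here
                        data <= handshake_data;
                        stb <= 1;
                        handshake_ack <= handshake_req_x;
                end
        end
endmodule
```

The data bus itself is not synchronized: it is sampled only after the synchronized `handshake_req` shows that it has stopped changing.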

Quote from: ataradov
Doing this synchronization slows down the transfer a lot, so this method is not very useful for sustained transfers. In that case, a FIFO with Gray-coded pointers is used.

If I got it right, a FIFO with Gray counters would still need a flip-flop pair, but would permit better throughput. I will read more on that.
 

Offline AndyC_772

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
  • Professional design engineer
    • Cawte Engineering | Reliable Electronics
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #4 on: May 19, 2022, 10:13:22 am »
The nice thing about using a FIFO is that your FPGA's tool chain almost certainly includes this functionality. You get to create and instantiate a FIFO component with the desired depth, width and logic signals (read request / acknowledge, full and empty flags etc), and the tools take care of the underlying logic.

It's academically interesting to note that there's double sampling and conversion to and from Gray codes going on in there somewhere, but you never have to actually see or do these things yourself.
 
The following users thanked this post: Someone

Offline Someone

  • Super Contributor
  • ***
  • Posts: 4525
  • Country: au
    • send complaints here
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #5 on: May 19, 2022, 10:48:40 am »
Also, for all the overhead and latency/performance compromise of a full handshake domain crossing you only gain some robustness over a simple toggle transfer. Given the choice of handshake vs double the stages in a toggle, I'll take the toggle.
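One common reading of a "simple toggle transfer" is sketched below, as an assumption about what is meant rather than a quote of anyone's code: the source flips a register on every event, and the destination synchronizes that level and detects its edges. The module and signal names are hypothetical.

```verilog
// Sketch of a toggle-based pulse crossing (no acknowledge path).
module pulse_cdc_toggle (
        input  wire src_clk,
        input  wire src_pulse,   // one src_clk cycle wide
        input  wire dst_clk,
        output wire dst_pulse    // one dst_clk cycle wide
);
        reg toggle = 1'b0;                    // source domain
        reg s1 = 1'b0, s2 = 1'b0, s3 = 1'b0;  // destination domain

        // every source event flips the level that crosses the boundary
        always @(posedge src_clk)
                if (src_pulse)
                        toggle <= !toggle;

        always @(posedge dst_clk) begin
                s1 <= toggle;  // may go metastable
                s2 <= s1;      // synchronized level
                s3 <= s2;      // previous synchronized level
        end

        // any change of the synchronized level is one event
        assign dst_pulse = s2 ^ s3;
endmodule
```

Without an acknowledge there is no back-pressure: events that arrive faster than the destination clock can sample the toggles are lost, which is the robustness that the full handshake buys.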

Quote from: AndyC_772
The nice thing about using a FIFO is that your FPGA's tool chain almost certainly includes this functionality.
This applies to domain crossing of pulses/resets too. If the vendor offers a library/synthesised black box leveraging their primitives, then use it!
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11234
  • Country: us
    • Personal site
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #6 on: May 19, 2022, 04:46:50 pm »
Quote from: josuah
If I got it right, a FIFO with Gray counters would still need a flip-flop pair, but would permit better throughput.
Yes, you need to synchronize the counter value the same way, via 2FF. Usually you implement a regular binary counter in the source clock domain, then convert the value to Gray code before the domain crossing. A binary-to-Gray encoder is trivial combinational logic.
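To show how small that logic is, here is a sketch of both directions of the conversion. The module names `bin2gray` and `gray2bin` and the `WIDTH` parameter are illustrative assumptions. For example, binary 011 and 100 (3 and 4) become Gray 010 and 110, which differ in a single bit.

```verilog
// Sketch: binary to Gray, one XOR per bit.
module bin2gray #(
        parameter WIDTH = 4
) (
        input  wire [WIDTH-1:0] bin,
        output wire [WIDTH-1:0] gray
);
        // each Gray bit is the XOR of two neighbouring binary bits
        assign gray = bin ^ (bin >> 1);
endmodule

// Sketch: Gray back to binary, an XOR chain from the top bit down.
module gray2bin #(
        parameter WIDTH = 4
) (
        input  wire [WIDTH-1:0] gray,
        output reg  [WIDTH-1:0] bin
);
        integer i;
        always @* begin
                bin[WIDTH-1] = gray[WIDTH-1];
                for (i = WIDTH-2; i >= 0; i = i - 1)
                        bin[i] = bin[i+1] ^ gray[i];
        end
endmodule
```

The Gray value should be registered in the source domain before it crosses, so that the synchronizer never samples a glitch from the XOR logic.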

The FIFO is there to absorb the extra synchronization latency, so you never need a FIFO that is deeper than the round-trip synchronization delay. You may use more for application needs, but the minimum requirement is typically only 5 entries or so. I just round it up to 8 to avoid messing with non-power-of-two counters.
Alex
 

Offline josuah (Topic starter)

  • Regular Contributor
  • *
  • Posts: 119
  • Country: fr
    • josuah.net
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #7 on: May 19, 2022, 05:32:27 pm »

I am making a note to take a good look at what is in the toy chest named toolchain.
Also, things like Zynq seem to pack a hefty number of hard cores (as Dave and Jack Ganssle said, don't Google it!).

https://www.xilinx.com/content/dam/xilinx/imgs/block-diagrams/zynq-mp-core-single.png

Quote from: ataradov
Yes, you need to synchronize the counter value the same way, via 2FF. Usually you implement a regular binary counter in the source clock domain, then convert the value to Gray code before the domain crossing. A binary-to-Gray encoder is trivial combinational logic.

The FIFO is there to absorb the extra synchronization latency, so you never need a FIFO that is deeper than the round-trip synchronization delay. You may use more for application needs, but the minimum requirement is typically only 5 entries or so. I just round it up to 8 to avoid messing with non-power-of-two counters.

It now looks much less hairy than I thought at first sight.

Thanks all!
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3709
  • Country: us
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #8 on: May 19, 2022, 06:27:43 pm »
The simple data handshake above (properly implemented) is basically just a 1-element FIFO. With 1 element, the read/write pointers are only 1 bit wide and trivially Gray coded. If you take that handshake module and use N of them in parallel, with muxes at the input/output, then you can increase the throughput. A bunch of data registers with muxes is just a RAM with a read and a write port. However, repeating the req/ack synchronizer logic for each element is inefficient: it is basically a thermometer encoding of the read/write pointers. Replace those with Gray codes and you have a regular FIFO. So there isn't really a disconnect between using a FIFO and using a "simple" handshake; they are just two versions of the same idea.
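To make the comparison concrete, here is a compact sketch of such a FIFO with Gray-coded pointers crossing through 2FF synchronizers. It is an illustration under assumptions, not a vendor implementation: the module name, port names, the reset-less initial values and the asynchronous memory read are all choices made for brevity, and `ABITS` must be at least 2.

```verilog
// Sketch of a dual-clock FIFO with Gray-coded pointers (names are hypothetical).
module async_fifo_sketch #(
        parameter WIDTH = 8,
        parameter ABITS = 3               // depth = 2**ABITS entries, ABITS >= 2
) (
        input  wire             wclk,
        input  wire             wr_en,
        input  wire [WIDTH-1:0] wr_data,
        output wire             full,

        input  wire             rclk,
        input  wire             rd_en,
        output wire [WIDTH-1:0] rd_data,
        output wire             empty
);
        reg [WIDTH-1:0] mem [0:(1<<ABITS)-1];

        // pointers carry one extra bit to tell "full" from "empty"
        reg [ABITS:0] wbin = 0, wgray = 0;   // write clock domain
        reg [ABITS:0] rbin = 0, rgray = 0;   // read clock domain

        // 2FF synchronizers for the Gray-coded pointers
        reg [ABITS:0] rgray_w1 = 0, rgray_w2 = 0;  // read pointer seen from wclk
        reg [ABITS:0] wgray_r1 = 0, wgray_r2 = 0;  // write pointer seen from rclk

        wire [ABITS:0] wbin_next = wbin + 1'b1;
        wire [ABITS:0] rbin_next = rbin + 1'b1;

        // full: pointers are one whole depth apart, which in Gray code means
        // the two top bits differ and the remaining bits match
        assign full    = (wgray == {~rgray_w2[ABITS:ABITS-1], rgray_w2[ABITS-2:0]});
        // empty: pointers are identical
        assign empty   = (rgray == wgray_r2);
        assign rd_data = mem[rbin[ABITS-1:0]];

        always @(posedge wclk) begin
                rgray_w1 <= rgray;
                rgray_w2 <= rgray_w1;
                if (wr_en && !full) begin
                        mem[wbin[ABITS-1:0]] <= wr_data;
                        wbin  <= wbin_next;
                        wgray <= wbin_next ^ (wbin_next >> 1);
                end
        end

        always @(posedge rclk) begin
                wgray_r1 <= wgray;
                wgray_r2 <= wgray_r1;
                if (rd_en && !empty) begin
                        rbin  <= rbin_next;
                        rgray <= rbin_next ^ (rbin_next >> 1);
                end
        end
endmodule
```

Because the synchronized pointers lag behind the real ones, `full` and `empty` can stay asserted a few cycles longer than strictly necessary, but they never deassert too early. In practice a vendor FIFO primitive would normally be preferred over hand-written code like this.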
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 26868
  • Country: nl
    • NCT Developments
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #9 on: May 20, 2022, 06:06:24 pm »
In addition to all the answers: applying the proper timing constraints is crucial when the handshake signals are used to transfer data between the two clock domains. Typically you'll need to set a constraint that ensures that the data goes from the source domain to the destination domain within one clock period of the destination domain. Don't assume synthesis tools will create such a constraint by themselves.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline josuah (Topic starter)

  • Regular Contributor
  • *
  • Posts: 119
  • Country: fr
    • josuah.net
Re: A simple Clock Domain Crossing (CDC) strategy
« Reply #10 on: June 12, 2022, 04:40:50 pm »
For those diving into the archives: see also a more recent thread, with some answers suggesting the use of more vendor-specific features, maybe related to timing constraints...

https://www.eevblog.com/forum/fpga/verilog-101-clock-domain-crossing/

This seems in accordance with:

Quote from: nctnico
In addition to all the answers: applying the proper timing constraints is crucial when the handshake signals are used to transfer data between the two clock domains. Typically you'll need to set a constraint that ensures that the data goes from the source domain to the destination domain within one clock period of the destination domain. Don't assume synthesis tools will create such a constraint by themselves.

But I have the impression this is not about transferring the control signals themselves, but rather about making sure the data gets through faster than the control signal, so that the data is stable when it is read, as in the image below, courtesy of https://blogs.synopsys.com/from-silicon-to-software/2021/11/23/clock-domain-crossing-asic-design/
 

