Author Topic: Gating the clock (Read 2109 times)

agehall · « **on:** February 26, 2024, 09:02:09 am »

So I'm not an HDL wizard by any means. I dip my toes in Verilog from time to time for hobby projects. One thing that I'm not sure I fully understand is how to reason about clock signals.

In most literature I've read, it is a big no-no to gate the clock. But then they happily show how you can divide the clock down by implementing a counter, taking the output from the highest bit of the counter, and using that as a clock for other parts of the system. Isn't that also gating the clock?

I'm sure this is a stupid question, but I'd really appreciate if someone could elaborate and explain a bit. I feel I'm missing something here...

woofy · « **Reply #1 on:** February 26, 2024, 10:00:56 am »

Not a stupid question at all, but a huge subject.
There is nothing wrong with clock gating as such; modern ICs do it all the time to reduce power consumption in inactive circuits. Dividing a clock down to multiple frequencies is also OK. The real issue is having multiple clocks, as in your divider example, and passing data from one clock domain to another. You have to ensure that the data produced by one clock is valid at the time a second clock samples it. There are many ways to deal with that; have a look here for starters:
https://www.fpga4fun.com/CrossClockDomain.html
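
As a rough illustration of the simplest case, here is a minimal sketch of a two-flip-flop synchronizer for a single-bit signal crossing into another clock domain. The module and signal names are made up for this example, and it assumes the input stays stable for longer than one destination clock period. Multi-bit buses need other methods, such as a handshake or a FIFO.

```verilog
// Two-flop synchronizer sketch (names are illustrative, not from any library).
// Assumes async_in is a single bit that stays stable for longer than one
// clk_dst period; it is NOT suitable for multi-bit buses.
module sync_2ff (
    input  wire clk_dst,   // destination clock domain
    input  wire async_in,  // signal launched from another clock domain
    output wire sync_out   // copy of async_in resynchronized to clk_dst
);
    reg [1:0] sync_ff = 2'b00;  // initial value; most FPGA flows honor this

    always @(posedge clk_dst) begin
        // The first stage may go metastable; the second stage gives it a
        // full clock period to settle before the rest of the logic sees it.
        sync_ff <= {sync_ff[0], async_in};
    end

    assign sync_out = sync_ff[1];
endmodule
```

The second register does not remove metastability, it only makes it far less likely to reach downstream logic, which is why the design stays at two or more stages rather than one.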

Berni · « **Reply #2 on:** February 26, 2024, 10:16:33 am »

The reasoning for avoiding clock gating is a bit more in depth than a simple yes or no.

FPGAs are very good at clock distribution because it is a common task, so they have dedicated wires that carry clocks around with minimal distortion and skew. However, once you gate a clock, it has to branch off from the dedicated clock distribution network and be carried through regular logic wires. This causes extra delay, skew, and timing uncertainty, so the logic driven by that clock will not be as well in sync and might not reach as high a frequency before things break due to timing getting out of whack.

The clock distribution network does have more than 1 channel, so it can distribute multiple clocks, but there is still a limited number of dedicated wires (exactly how many depends on the FPGA family) so you want to keep the number of widely used clock signals down to a minimum. One way of doing this is avoiding clock gating.

Also in general the gate always adds a tiny bit of extra delay, so when the clock gated part of the logic interacts with other logic, it will be slightly behind in timing.

So it is not that clock gating should never ever be used, just that it should be a last resort when other methods can't do what you need. For example, to disable part of a circuit it is often more performant to disconnect the data inputs to that circuit and leave its clock running constantly than to stop its clock.

MarginallyStable · « **Reply #3 on:** February 26, 2024, 05:59:52 pm »

Using the output of a timer (or directly from a register) is more like generating a clock in my mind. Clock gating usually uses combinational logic (think of an AND gate with the clock as one input) and can be tricky to get right from a timing perspective. You can end up with undesirable glitches on the gated clock that do not meet minimum pulse width requirements, etc. Many FPGAs do have dedicated clock gating blocks that help with timing closure. The usual recommended alternative is to use clock enables that are synchronous to the clock and can therefore go through normal timing closure. But this does little to save on energy consumption, as the clocks still propagate to the disabled circuitry.
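
To make the difference concrete, here is a small sketch contrasting the two styles. The module name, port names, and the `DIV_BITS` parameter are hypothetical, chosen only for this example. The commented-out lines show the AND-gate pattern to avoid in fabric; the live code keeps everything on one clock and uses a one-cycle-wide enable derived from a counter.

```verilog
// Sketch only: module and signal names are hypothetical.
module ce_example #(
    parameter DIV_BITS = 4          // enable fires once every 2**DIV_BITS clocks
) (
    input  wire       clk,
    input  wire       rst,          // synchronous reset, active high
    input  wire [7:0] din,
    output reg  [7:0] dout
);
    // Pattern to avoid in fabric: a combinational gate in the clock path.
    //   wire gated_clk = clk & run;  // 'run' changing while clk is high
    //                                // can create a runt pulse
    //   always @(posedge gated_clk) dout <= din;

    // Alternative: everything stays on clk; a one-cycle-wide enable
    // sets the pace instead of a second, slower clock.
    reg  [DIV_BITS-1:0] div_cnt;
    wire tick = (div_cnt == {DIV_BITS{1'b1}});  // high for one clk per wrap

    always @(posedge clk) begin
        if (rst) begin
            div_cnt <= {DIV_BITS{1'b0}};
            dout    <= 8'd0;
        end else begin
            div_cnt <= div_cnt + 1'b1;
            if (tick)
                dout <= din;        // typically maps to the FF clock-enable pin
        end
    end
endmodule
```

With `DIV_BITS = 4`, `dout` updates once every 16 cycles of `clk`, which gives the same update rate as clocking it from the top bit of a 4-bit counter, but without creating a second clock domain.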

radar_macgyver · « **Reply #4 on:** February 26, 2024, 08:05:12 pm »

The issue becomes a bit more clear when you figure out why this is a recommendation. If one uses a divider implemented in general logic to generate a slower clock, the edges are no longer perfectly synchronous with the source clock. Additionally, the routing resources used for such general-purpose signals have routing delays that are hard to predict, especially over process, voltage and temperature.
Dedicated clock routing resources, on the other hand, are designed for low skew between any two points on the chip. They have much better characterized routing delays, and these are baked into the tables that the place-and-route tool uses when solving the routing of an FPGA design.
Dedicated clock resources, at least on Xilinx chips (I don't have experience on others) offer clock dividers, PLLs, multiplexers and buffers with a disable control. These can be used to drive the clocks needed in a design, and also gate them off, for example to save power. One can certainly use a divider to generate a clock for something non-critical (eg: blink an LED), and usually the tools will warn you and then exclude such paths from the timing optimization. However, there are usually only a few such buffers and multiplexers on a given FPGA, and the global clock nets they drive are also limited resources (as @Berni said), so the usual approach is to have a synchronous design with one (or relatively few) clocks, taking advantage of the synchronous clock enable inputs on each FF. This is much easier for the tools to analyze and verify timing.
On Xilinx FPGAs, a 'BUFG' (global clock buffer) drives a global clock net. For most smaller FPGAs, there are maybe 8 or 16 global clock nets, each driven by its own BUFG. If you want to implement a gated clock, use a BUFGMUX (multiplexer variant of a BUFG), with one input set to '0', and the select line of the BUFGMUX implementing the gating signal. The big differences between using a BUFGMUX and implementing the same thing in the general logic fabric are:
1. Each BUFGMUX drives its own dedicated global clock line, so the tools have a known delay when computing the clock skew. If one uses general fabric to do this, the tools will use general purpose routing to bring the gated clock to the target flip flops, and the routing delay for these nets is not easy to predict.
2. The BUFGMUX has extra logic built in to avoid generating runt pulses on the global clock lines. For example, if you're gating a clock with a signal that can transition close to one of the edges of the clock, it can produce a runt pulse that's a fraction of the period of the clock. BUFGMUXes include logic to avoid this.
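
As a sketch of the BUFGMUX approach described above, something like the following could be used. The wrapper module and signal names are hypothetical. `BUFGMUX` with ports `O`, `I0`, `I1` and `S` is a Xilinx vendor primitive, so this only applies to Xilinx devices, and the details (including whether a dedicated `BUFGCE` enable buffer is the better fit) should be checked against the clocking guide for the specific family.

```verilog
// Sketch for Xilinx devices only; BUFGMUX is a vendor primitive, so this
// will not elaborate for other FPGA families or in a simulator without
// the Xilinx primitive library. Wrapper and signal names are hypothetical.
module gated_clock_bufgmux (
    input  wire clk_in,     // free-running clock, e.g. from a clock pin or PLL
    input  wire clk_run,    // 1 = pass clk_in, 0 = hold the output clock low
    output wire clk_gated   // drives a global clock net
);
    BUFGMUX bufgmux_inst (
        .O  (clk_gated),
        .I0 (1'b0),         // selected when S = 0: clock held low
        .I1 (clk_in),       // selected when S = 1: clock passes through
        .S  (clk_run)
    );
endmodule
```

Every flip-flop clocked from `clk_gated` is then in the gated domain, so signals crossing between it and logic on `clk_in` still deserve the same care as any other clock domain crossing.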

Each Xilinx family of chips has a clock resources user guide where these and other topics are explored in detail.

agehall · « **Reply #5 on:** February 27, 2024, 06:13:01 am »

Thanks for all the responses! This really helps me formulate a better understanding of the problem.

