How about using a unity-gain, fast-recovery RRIO op amp as a limiting buffer? Set its supply rails to the desired limits. The output should swing to within a few tens of millivolts of the rails.
This was presented earlier (see #50). It works if fixed limits are fine, but it's not practical to make it adjustable, and the inputs need to be within that band as well.
Or use a fast-recovery op amp driving a push-pull output stage to form a unity-gain buffer:
https://upload.wikimedia.org/wikipedia/en/thumb/1/1e/Pushpull.PNG/330px-Pushpull.PNG
Use MOSFETs instead of bipolar transistors. Set the MOSFET output stage voltages V+ and V- to the desired limits.
Ah, but that's the trick, isn't it?
If we always had "fast recovery" amps, like transconductance amps, we could just clamp the output directly -- this is what the AD8036 does internally. There's a transconductance gain stage, which provides a constant current output and extremely high voltage gain from the main input; precision clamps (wired in a similar manner) set the max and min ranges. The limits are freely adjustable, independent of both supply and input voltages, and recovery is rapid (with no overshoot or windup). Finally, a buffer follows the gain node, so the output has a low impedance and high current drive capability.
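To put a rough number on the "no windup" point, here is a toy discrete-time sketch. It is not a model of the AD8036 or of any real part: the integrator "gain node", the loop gain k, the 12 V rail and the 1 V limit are all made-up values for illustration. It compares a gain node that is itself clamped at the limit against one that is free to run to the rail while only the output is clamped.

```python
def clamp(x, lo, hi):
    """Limit x to the range [lo, hi]."""
    return max(lo, min(hi, x))

def simulate(v_in, limit, state_max, k=0.5):
    """Toy unity-gain loop (assumed model, not a real amplifier).

    state      : gain-node voltage, an integrator limited to +/-state_max
    output     : state clamped to +/-limit
    state_max == limit -> clamp acts at the gain node (no windup)
    state_max >  limit -> only the output is clamped, the node winds up
    """
    state = 0.0
    out = []
    for v in v_in:
        err = v - clamp(state, -limit, limit)   # feedback taken from the clamped output
        state = clamp(state + k * err, -state_max, state_max)
        out.append(clamp(state, -limit, limit))
    return out

def stuck_samples(out_after_release, limit):
    """Count leading samples that are still sitting at the positive limit."""
    n = 0
    for v in out_after_release:
        if v < limit:
            break
        n += 1
    return n

LIMIT = 1.0    # assumed clamp level, volts
RAIL = 12.0    # assumed op-amp rail, volts
N_OVER = 20    # samples of overdrive before the input returns to zero
v_in = [3.0] * N_OVER + [0.0] * 40

windup = simulate(v_in, LIMIT, state_max=RAIL)
no_windup = simulate(v_in, LIMIT, state_max=LIMIT)

print("samples stuck at limit, node free to hit the rail:",
      stuck_samples(windup[N_OVER:], LIMIT))
print("samples stuck at limit, node clamped:",
      stuck_samples(no_windup[N_OVER:], LIMIT))
```

The absolute numbers mean nothing; the point is only that the recovery delay scales with how far the internal node ran past the limit, and disappears when the clamp acts on that node directly.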
If you use MOSFETs, you have not only the problem of windup (once the limiting follower clamps the output, the op-amp runs to its rail, so it takes time to come back down, and it drives the gates hard in the meantime), but crossover distortion and biasing, too. If done as shown, with enhancement-mode FETs, there's a huge (>2V?) deadband where the op-amp output has no effect on the buffer output. Normally a class B or C bipolar output stage (e.g., LM324) will have a < 0.3V step at this point, but now you're talking huge slop, and no matter how fast your amp is, slewing across it takes time and will impact performance!
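A quick sketch of the deadband, for anyone who wants to play with numbers. Assumptions, all mine: an idealized unbiased complementary source follower, open loop, with a resistive load to ground so the output sits at 0 V inside the deadband; equal thresholds of 1.5 V for both FETs; clamp rails of +/-1 V; and a 10 V/us op-amp slew rate. None of these come from a datasheet.

```python
def follower_out(v_gate, v_th=1.5, v_hi=1.0, v_lo=-1.0):
    """Output of an idealized, unbiased complementary source follower.

    v_gate     : op-amp output driving both gates, volts
    v_th       : assumed threshold magnitude of both FETs, volts
    v_hi, v_lo : drain supplies, i.e. the clamp limits, volts
    """
    if v_gate > v_th:
        v = v_gate - v_th      # N-channel conducts, output follows one threshold below
    elif v_gate < -v_th:
        v = v_gate + v_th      # P-channel conducts, output follows one threshold above
    else:
        v = 0.0                # deadband: neither FET on, load holds output at ground
    return max(v_lo, min(v_hi, v))   # output cannot pass the drain supplies

def deadband_crossing_time(v_th, slew_rate):
    """Time for the op-amp to slew across the full 2*v_th deadband, seconds."""
    return 2.0 * v_th / slew_rate

for vg in (-3.0, -1.0, 0.0, 1.0, 2.0, 3.0):
    print(f"gate {vg:+.1f} V -> out {follower_out(vg):+.2f} V")

print("crossing time at 10 V/us:", deadband_crossing_time(1.5, 10e6), "s")
```

With these made-up values the gate has to move 3 V with no change at the output, which costs about 0.3 us at 10 V/us on every zero crossing, before any settling.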
And if you add the bias components, now you have the issue of bias current drawn from the clamp supplies, which loads the buffers setting those voltages and causes a nonzero error, on top of the Rds(on) limitation of the FETs. So the output might not pull to within mV of the limits anymore, and how much it's off by depends on the supply voltages.
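As a back-of-envelope check on that error, a first-order estimate might look like this. The model (two simple IR drops added together) and every number in it are assumptions for illustration: 1 mA of bias current, a 1 ohm output resistance for the buffer that sets the clamp voltage, 10 mA of load current, and 0.5 ohm Rds(on).

```python
def clamp_error(i_bias, r_ref, i_load, rds_on):
    """First-order shortfall between the clamp setpoint and the clamped output, volts.

    i_bias : bias-network current drawn from the clamp supply, amps (assumed)
    r_ref  : output resistance of the buffer setting the clamp voltage, ohms (assumed)
    i_load : load current through the conducting FET while clamped, amps (assumed)
    rds_on : on-resistance of that FET, ohms (assumed)
    """
    droop_at_reference = i_bias * r_ref   # clamp supply sags under the bias current
    drop_across_fet = i_load * rds_on     # FET cannot pull all the way to its drain
    return droop_at_reference + drop_across_fet

err = clamp_error(i_bias=1e-3, r_ref=1.0, i_load=10e-3, rds_on=0.5)
print(f"estimated clamp error: {err * 1e3:.1f} mV")
```

Since the bias current in a real network is set by the main supplies, i_bias (and so the error) moves with supply voltage, which is the dependence mentioned above.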
On a more subtle note, there's also feed-through from the op-amp output, via G-S capacitance. So the class C biasing will "anticipate" the op-amp recovery before it finally happens, causing a pre-shoot effect.
It's a good building block, but it needs quite a bit of elaboration to polish into a precision circuit. The foundation comes from RF mixer designs -- I suppose all of these do, ultimately: the basic idea is to use diodes and current steering, or saturated switches (usually FETs), to toggle an RF signal between "on" and "off" (two devices in series, single balanced mixer), or "plus" and "minus" (double balanced mixer, four devices in an H-bridge configuration). The overshoot and preshoot issues, in RF terms, manifest as excess reactance and poor LO/RF isolation.
Tim