Reviving this thread...
I'm considering implementing this in VHDL.
Have any of you already done this? Any tips? I've read about it, and the base algorithm seems simple enough, but implementing it properly in an HDL with pure integers, without losing precision, is not as trivial as it looks. My goal is to use it for integer division rather than FP division, so the integer quotient obviously needs to be exact.
If anyone's interested, it would be cool to share a working solution once one of us gets there. If I get there, I'll share it.