In case you're still interested, here's a small script to factor any divisor.

`#! /usr/bin/env python3`

import math as m

import sys

WANT=(int)(sys.argv[1])

first=round(m.log(WANT,2))

error=1./WANT - 1./(2**first)

terms=[first]

while m.fabs(error) > 1./(1<<63):

sign = m.copysign(1, error)

term = -round(m.log(m.fabs(error), 2.))

terms.append(sign*term)

error -= sign*(2**(-term))

print(terms)

And if you run it for 1000,

`[10, 15.0, -17.0, 21.0, 24.0, 26.0, -29.0, -33.0, -34.0, 36.0, -38.0, -42.0, 45.0, -48.0, -50.0, -51.0, 53.0, -60.0]`

This says

`x/1000 ~= (x >> 10)`

+ (x >> 15)

- (x >> 17)

+ (x >> 21)

+ (x >> 24)

+ ...

Obviously, for a 16-bit integer, terms past 1>>15 don't matter. For 32-bit integers, the largest term is -(x >> 29).

A cursory check shows this yields a 1-bit error for values 1000-65535. Noticeable though is that 1000-1023 yield 0 for 16-bit integers.

(BTW, it's probably not accurate to the full 64 bits due to lack of precision in the 64-bit floating point error variable. I think it's something like 53 bits.)

Edit: here's a variant to output a C++ function for up to 32 bit integers.

`#! /usr/bin/env python3`

import math as m

import sys

WANT=(int)(sys.argv[1])

first=round(m.log(WANT,2))

error=1./WANT - 1./(2**first)

terms=[first]

print("template <typename T>")

print("static inline T div_%s(const T& val) {" % WANT)

print(" return (val >> %s)" % first, end='')

while m.fabs(error) > 1./(1<<32):

sign = m.copysign(1, error)

term = -round(m.log(m.fabs(error), 2.))

print(" %s (val >> %s)" % ("-" if sign == -1.0 else "+", (int)(term)), end='')

error -= sign*(2**(-term))

print(";\n}")

Which produces:

`template <typename T>`

static inline T div_1000(const T& val) {

return (val >> 10) + (val >> 15) - (val >> 17) + (val >> 21) + (val >> 24) + (val >> 26) - (val >> 29);

}

A #pragma to suppress warnings about integer shifts always being zero might be useful for integers smaller than uint32_t.