IEEE 754 Binary32 float storage representations:
0x00000000 = +0.0f
0x00000001 .. 0x007FFFFF = positive subnormals, 2⁻¹⁴⁹ .. 8388607×2⁻¹⁴⁹, inclusive; all < 2⁻¹²⁶
0x00800000 = 2⁻¹²⁶, smallest positive normal
0x3F800000 = +1.0f
0x4B800000 = +16777216.0f, largest integer in the consecutive range
0x4B800001 = +16777218.0f, the next representable value (16777217 is not exactly representable)
0x7F7FFFFF = 16777215×2¹⁰⁴ < 2¹²⁸, largest positive value representable
0x7F800000 = +Infinity
0x7F800001 .. 0x7FFFFFFF = +NaN
0x80000000 = -0.0f
0x80000001 .. 0x807FFFFF = negative subnormals, -2⁻¹⁴⁹ .. -8388607×2⁻¹⁴⁹, inclusive; all > -2⁻¹²⁶
0x80800000 = -2⁻¹²⁶, largest negative normal
0xBF800000 = -1.0f
0xCB800000 = -16777216.0f, smallest integer in the consecutive range
0xFF7FFFFF = -16777215×2¹⁰⁴ > -2¹²⁸, most negative value representable
0xFF800000 = -Infinity
0xFF800001 .. 0xFFFFFFFF = -NaN
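These are easy to check on a conforming system. Here is a minimal sketch of my own (not the examination tool mentioned below), assuming float is IEEE 754 Binary32 and has the same size and byte order as uint32_t, which reinterprets a few of the storage representations above:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Reinterpret a 32-bit storage representation as a float. */
    static float from_bits(uint32_t u)
    {
        float f;
        memcpy(&f, &u, sizeof f);
        return f;
    }

    int main(void)
    {
        printf("0x3F800000 = %.9g\n", from_bits(UINT32_C(0x3F800000))); /* +1.0f */
        printf("0x4B800000 = %.9g\n", from_bits(UINT32_C(0x4B800000))); /* +16777216.0f */
        printf("0x4B800001 = %.9g\n", from_bits(UINT32_C(0x4B800001))); /* +16777218.0f */
        printf("0x00000001 = %.9g\n", from_bits(UINT32_C(0x00000001))); /* smallest positive subnormal */
        printf("0x7F7FFFFF = %.9g\n", from_bits(UINT32_C(0x7F7FFFFF))); /* largest finite float */
        return 0;
    }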
Zero is the only repeated numeric value, and only because +0.0f and -0.0f have their own separate representations.
There is exactly one positive infinity and one negative infinity, but 8388607 positive NaNs and 8388607 negative NaNs, and thus 16777214 NaN representations. Because any arithmetic operation involving a NaN yields a NaN (typically propagating the payload of one NaN operand), you have millions of non-numeric "marker" values you can use. For example, a stack-based single-precision floating-point calculator might use NaNs to encode the operators (say, positive NaNs) and functions (say, negative NaNs), and thus need only a single stack of float values.
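As a rough sketch of that idea (my own illustration, not code from any earlier reply): pack a small code into the payload of a positive quiet NaN, and recover it by inspecting the bits directly. The token layout below is entirely made up for the example.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Encode a (hypothetical) operator code into a positive quiet NaN.
       0x7FC00000 sets the exponent and quiet bits; the low 22 bits carry the code. */
    static float op_token(uint32_t code)
    {
        uint32_t u = UINT32_C(0x7FC00000) | (code & UINT32_C(0x003FFFFF));
        float f;
        memcpy(&f, &u, sizeof f);
        return f;
    }

    /* Return nonzero and store the code if f is one of our positive-NaN tokens. */
    static int is_op_token(float f, uint32_t *code)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        if ((u & UINT32_C(0xFFC00000)) != UINT32_C(0x7FC00000))
            return 0;
        if (code)
            *code = u & UINT32_C(0x003FFFFF);
        return 1;
    }

    int main(void)
    {
        float item = op_token(42);  /* a marker pushed on the value stack */
        uint32_t code;
        if (is_op_token(item, &code))
            printf("operator token %u\n", (unsigned)code);
        else
            printf("numeric value %.9g\n", item);
        return 0;
    }

Negative NaNs (sign bit set) could carry function codes the same way.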
On architectures like x86, x86-64, and any ARM with hardware floating-point support (in fact, any Linux architecture with hardware or software floating-point support), you can use the float representation examination tool I posted in reply #37 to explore and check these; I use it constantly when working with floats.
If you want to add a check for architecture support to it, I'd add

    if (sizeof (float) != sizeof (uint32_t) ||
        ((word32){ .u = 0x00000000 }).f != ((word32){ .u = 0x80000000 }).f ||
        ((word32){ .u = 0xC0490FDB }).f != -3.1415927410125732421875f ||
        ((word32){ .u = 0x4B3C614E }).f != 12345678.0f) {
        fprintf(stderr, "Unsupported 'float' type: not IEEE 754 Binary32.\n");
        exit(EXIT_FAILURE);
    }

to the beginning of main(). I do not normally bother, because I haven't had access to a system where that would trigger in decades. That includes all my SBCs and microcontrollers. (Notably, I do not have any DSPs, which are the exception, and actually could still use a non-IEEE 754 Binary32 float type.)
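For reference, the check above assumes a word32 type along these lines (the actual definition is in the reply #37 tool; this is only my guess at it):

    #include <stdint.h>

    /* A union for examining float storage representations;
       assumed to match the one used by the examination tool. */
    typedef union {
        uint32_t  u;  /* storage representation */
        float     f;  /* value */
    } word32;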
The function in reply #50 yields the difference between any two non-NaN float values, measured in steps between representable non-NaN float values. If they are the same value, it returns 0; if they are consecutive representable numeric float values, it returns 1; if there is one representable numeric float between the two values, it returns 2; and so on. For example, for the difference between the smallest positive subnormal and the largest negative subnormal (storage representations 0x00000001 and 0x80000001) it returns 2, because zero is the only representable numeric value between them.
The key idea in that difference is that the storage representation of a negative float is subtracted from 0x80000000, which turns it into the integer negative of the storage representation of the corresponding positive float value. Note that this also maps -0.0f to the same integer as +0.0f, 0x00000000. The difference of these modified signed integer storage representations is then the difference in representable float values described in the previous paragraph.
For radix sorting, the storage representation is kept as a 32-bit unsigned integer. For representations that have the sign bit clear, you set the most significant bit; for all other representations, you invert all bits (~). This keeps the representations of +0.0f and -0.0f separate (0x80000000 and 0x7FFFFFFF, respectively), puts the representations of positive floats above those of negative ones, and reverses the order of the representations of the negative ones, essentially ensuring that all non-NaN float values order the same way as their modified unsigned 32-bit storage representations.
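A minimal sketch of that forward mapping (my own illustration), operating on a storage representation already copied into a uint32_t:

    #include <inttypes.h>
    #include <stdio.h>

    /* Map a float storage representation to an unsigned radix-sort key
       whose unsigned order matches the float order for non-NaN values. */
    static uint32_t radix_key(uint32_t u)
    {
        if (u & UINT32_C(0x80000000))
            return ~u;                    /* sign bit set: invert all bits */
        return u | UINT32_C(0x80000000);  /* sign bit clear: set the most significant bit */
    }

    int main(void)
    {
        printf("key(+0.0f) = 0x%08" PRIX32 "\n", radix_key(UINT32_C(0x00000000)));  /* 0x80000000 */
        printf("key(-0.0f) = 0x%08" PRIX32 "\n", radix_key(UINT32_C(0x80000000)));  /* 0x7FFFFFFF */
        return 0;
    }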
After radix sorting, the inverse operation inverts all bits (~) if the most significant bit is clear, and only clears the most significant bit if it is set. This undoes the storage representation modification, restoring the exact original float representations.
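And the corresponding inverse, again as my own sketch matching the description above:

    #include <stdint.h>

    /* Undo the radix-sort key mapping: recover the original float storage representation. */
    static uint32_t radix_unkey(uint32_t k)
    {
        if (k & UINT32_C(0x80000000))
            return k & UINT32_C(0x7FFFFFFF);  /* most significant bit set: just clear it */
        return ~k;                            /* most significant bit clear: invert all bits */
    }

Together, radix_unkey(radix_key(u)) == u for every 32-bit pattern u, so the round trip is exact.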
I, too, use fixed-point types quite often. However, this thread is about the specific case when you have hardware float support and want to leverage that instead. So, commenting that one should use fixed point here instead of hardfloat is an apples-to-oranges replacement suggestion.