So anyway, if you want to represent exactly -1 and +1 (assuming you really need that), you'd use Q1.14, which would mean 1 integer bit, 14 fractional bits, and an implied sign bit.
We're digressing a bit, but anyway. The Q notation is always a bit ambiguous: there are actually two conventions, one including the sign bit and the other not. If you know the word length is 16-bit, Q1.15 usually means that the leading 1 bit is just the sign bit (as you assumed), but that's confusing since it's meant to be the integer part, so definitions vary a bit from one vendor/textbook to another. As long as you know exactly what is being talked about, it's fine...
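To make the ±1 point concrete, here is a minimal sketch (the macro and helper names are mine, assuming a 16-bit word holding sign + 1 integer bit + 14 fractional bits, i.e. Q1.14 in the convention that does not count the sign bit; no rounding or saturation for brevity):

Code:
#include <stdint.h>

/* Q1.14 in a 16-bit word: 1 sign bit, 1 integer bit, 14 fractional bits.
   Scale factor is 2^14, so +1.0 and -1.0 are exactly representable
   (16384 and -16384), which plain Q0.15 cannot do for +1.0. */
#define Q14_ONE   (1 << 14)

static inline int16_t float_to_q14(float x)  { return (int16_t)(x * Q14_ONE); }
static inline float   q14_to_float(int16_t q) { return (float)q / Q14_ONE; }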
I've often used unsigned fixed point. A whole range of calculations don't need negative numbers. Some definitions would write what I meant as: UQ1.15 ('U' as in unsigned).
Regarding unsigned fractional, UQm.n, it is like so:
UQ1.15 means from 0.00000 up to 1.999something.
I think wikipedia is quite clear about this: https://en.wikipedia.org/wiki/Q_(number_format)
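A quick numeric check of that range (my own arithmetic, assuming UQ1.15 packed into a 16-bit word):

Code:
#include <stdint.h>

/* UQ1.15 in a 16-bit word: no sign bit, 1 integer bit, 15 fractional bits.
   Scale factor is 2^15 = 32768. */
uint16_t lo = 0x0000;   /* 0     / 32768 = 0.0               */
uint16_t hi = 0xFFFF;   /* 65535 / 32768 = 1.999969482421875 */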
However, writing it as Q1.14 implies you have a 15-bit register. That is a pretty nonstandard use of the notation and I would strongly suggest against using it.
There are two conflicting notations for fixed point. Both notations are written as Qm.n, where:
Q designates that the number is in the Q format notation – the Texas Instruments representation for signed fixed-point numbers (the "Q" being reminiscent of the standard symbol for the set of rational numbers).
m. (optional, assumed to be zero or one) is the number of bits set aside to designate the two's complement integer portion of the number, exclusive or inclusive of the sign bit (therefore if m is not specified it is taken as zero or one).
n is the number of bits used to designate the fractional portion of the number, i.e. the number of bits to the right of the binary point. (If n = 0, the Q numbers are integers – the degenerate case).
One convention includes the sign bit in the value of m,[1][2] and the other convention does not. The choice of convention can be determined by summing m+n. If the value is equal to the register size, then the sign bit is included in the value of m. If it is one less than the register size, the sign bit is not included in the value of m.
Nope. Read again the very article you linked to; the excerpt above describes both conventions.
And I just prefer the second convention, which doesn't include the sign bit in "m", as I find it more consistent. I know the other notation is also common.
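For what it's worth, a tiny illustration of the two namings applied to the same 16-bit word (my own example values, not from the posts above):

Code:
#include <stdint.h>

/* The same 16-bit two's-complement word, two naming conventions:
   - sign bit counted in m:     "Q1.15", m + n = 16 = word size
   - sign bit not counted in m: "Q0.15", m + n = 15 = word size - 1
   Either way the real value is raw / 2^15. */
int16_t raw   = (int16_t)0xC000;      /* bit pattern 1100 0000 0000 0000 */
double  value = -16384 / 32768.0;     /* raw = -16384, i.e. -0.5 under both namings */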
Assembler? That is what the intrinsics are for. Perfectly fine to be used within the C environment. Even the CMSIS DSP library is written completely in C, using the intrinsics.
(I would not call that an assembly language).
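As a rough sketch of what that looks like in practice (assuming a Cortex-M core with the DSP extension so the __SMLAD intrinsic is available from the CMSIS-Core / device header; the helper name dot_q15 is mine, not from CMSIS):

Code:
#include <stdint.h>
#include <string.h>
/* Assumes the device/CMSIS-Core header providing __SMLAD is included,
   e.g. via the usual device header in a real project. */

/* Q15 dot product: each product is Q30, summed into a 64-bit accumulator.
   __SMLAD does two signed 16x16 multiply-accumulates per instruction. */
static int64_t dot_q15(const int16_t *a, const int16_t *b, uint32_t n)
{
    int64_t acc = 0;
    uint32_t i = 0;
    for (; i + 2 <= n; i += 2) {
        uint32_t pa, pb;
        memcpy(&pa, &a[i], sizeof pa);   /* pack two Q15 samples per 32-bit word */
        memcpy(&pb, &b[i], sizeof pb);
        acc += (int32_t)__SMLAD(pa, pb, 0);
    }
    for (; i < n; i++)                   /* tail element if n is odd */
        acc += (int32_t)a[i] * (int32_t)b[i];
    return acc;                          /* shift right by 15 to scale back to Q15 */
}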
The only reason I can think of for the 16-bit float is that you can reduce expensive store/load operations if 32-bit is unnecessary.
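A small sketch of that idea (assuming a GCC/Clang ARM toolchain where __fp16 is available as a storage-only half-precision type; the function name mac_fp16 is mine):

Code:
/* Weights kept in memory as 16-bit floats (half the load/store traffic),
   arithmetic still done in 32-bit float: __fp16 promotes to float when used. */
float mac_fp16(const __fp16 *w, const float *x, unsigned n)
{
    float acc = 0.0f;
    for (unsigned i = 0; i < n; i++)
        acc += (float)w[i] * x[i];
    return acc;
}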
From what I read, an important usage of FP16 is for applications that need high dynamic range but not very high SNR. Speech recognition is an example: the audio samples can have a huge dynamic range, but recognizing the sound may only need a moderate SNR. Whether you speak right next to the microphone or 10 feet away, algorithms implemented with FP16 work well in both cases.
I see, but FP16 has only a 5-bit exponent... the dynamic range will still not be very "high".
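For reference (my own back-of-the-envelope numbers, not from the posts): IEEE binary16 has 1 sign, 5 exponent and 10 mantissa bits; the largest finite value is 65504 and the smallest normal is 2^-14 ≈ 6.1e-5 (subnormals reach about 6e-8), so the normal range spans roughly 1e9, or about 180 dB, with about 11 significant bits (≈66 dB) of precision at any given scale.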
As to dynamic range vs. SNR, that would be an interesting discussion. Someone may argue that a good AGC plus a VGA for the microphone could get the work done too, but obviously that's harder to get right and much less appealing in this digital world.
Yes that too. Well if you're relying on FP operations, you're still in the digital domain. So if it can be done with FP, it can be done with a purely digital AGC with fixed point (and get you better SNR).
But yeah it again all comes down to keeping things very simple, and likely relying on ready-made FPUs as I hinted above.
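A toy sketch of what such a purely digital, fixed-point AGC could look like (entirely my own simplification, not from any of the posts: a one-pole envelope follower in Q15 driving a crude gain correction; assumes arithmetic right shift, as on typical ARM compilers):

Code:
#include <stdint.h>

/* Toy Q15 AGC: track the signal envelope with a one-pole filter and
   nudge the gain toward a target level. All values are Q15. */
#define AGC_TARGET  8192          /* desired envelope, ~0.25 full scale */
#define AGC_ALPHA   64            /* envelope smoothing, ~0.002 in Q15  */

static int32_t env  = 0;          /* envelope estimate, Q15 */
static int32_t gain = 32767;      /* current gain, Q15 (max ~1.0) */

int16_t agc_step(int16_t x)
{
    int32_t mag = x >= 0 ? x : -x;
    /* env += alpha * (mag - env), done in Q15 */
    env += (AGC_ALPHA * (mag - env)) >> 15;
    /* crude gain adjustment: step gain up/down toward the target */
    if (env > AGC_TARGET && gain > 0)          gain--;
    else if (env < AGC_TARGET && gain < 32767) gain++;
    /* apply gain: Q15 * Q15 -> Q30, shift back to Q15, saturate to int16 */
    int32_t y = ((int32_t)x * gain) >> 15;
    if (y > 32767)  y = 32767;
    if (y < -32768) y = -32768;
    return (int16_t)y;
}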
You are misunderstanding the targeted application area -- neural networks.
The trained networks are very fuzzy, in the sense of being imprecise. No element or weight has been studied and calculated to have the correct value, or even a reasonable value. Instead they just run the training until the wrong results are reduced in influence enough that a not-obviously-wrong output is produced. Each element contributes a little bit of the influence for the result, and each intermediate response is weighted.
Like in the real world, some voices get a weight of 0.0000000001 and others get a weight of 0.99. With enough inputs, it doesn't matter if a particular weight is 0.51 or 0.52. But you don't want the 0.0000000001 rounded up to 0.01 or forgotten completely.
The point here is that, while superficially similar to SNR, it's quite different in detail. In particular the concept of AGC doesn't apply because you simultaneously want to add in very tiny contributions while being heavily influenced by the strong weightings.
I don’t get it.
Who is really doing AI / ML at the far edge on Cortex processors? I'm sure someone is, but how many people could this really apply to?
See for example the Kendryte K210 chip
Is there a complete datasheet and user manual for that thing? What about toolchains and tutorials for it? It seems very interesting, but docs and info are lacking, or at least I could not find them.
I can see how FP16 would allow representing smaller values in the [0, 1] interval than fixed point of similar width. What I'm still not convinced about is the real benefit over fixed point. I'd like to see comparative examples which clearly show that FP16 yields better results overall. For instance, at some point you have numbers so small that they would be represented as 0 with fixed point but as some (low-precision) value with FP16... but would that contribute enough to the overall weighted sum to matter? (Given that even big NNs are usually limited in the number of "neurons", and as I understand it, we tend to favor more layers these days rather than more neurons per layer...)
To sum it up, I would like to see a real comparison between FP16 and fixed point that would clearly show the benefits in real cases.
Maybe in the end FP16 takes fewer resources overall than fixed point: even though fixed point itself is cheaper to implement, the calculations using it need more care, so fixed point could end up being more expensive overall. I'm just not quite sure or convinced at this point, and would be interested in reading papers about that specifically if there are any.
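To put the "too small for fixed point" case in concrete numbers (my own toy values, comparing Q15 against IEEE binary16, and assuming the compiler provides _Float16, e.g. recent GCC/Clang for ARM; this illustrates the question above, it doesn't settle whether the contribution matters in a real network):

Code:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    float w = 1e-5f;   /* a small weight */

    int16_t  q15  = (int16_t)(w * 32768.0f + 0.5f);  /* 1e-5 * 32768 ≈ 0.33, rounds to 0 */
    _Float16 half = (_Float16)w;                      /* subnormal in binary16, but nonzero */

    printf("Q15: %d  (back to real: %g)\n", q15, q15 / 32768.0);
    printf("FP16 as float: %g\n", (double)(float)half);
    return 0;
}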
It seems really strange to have ML in a processor targeting embedded, battery powered devices.
There is a massive difference between training the NN and applying the results of the training.
Well, I understand the explanation about the tiny and large weights. But a lot of ML algorithms use 8-bit fixed point. What's the difference here? What kind of ML algorithm needs floating point, and what kind only requires 8-bit fixed point?