General > General Technical Chat
Odds seem too high when guessing random zeroes and ones
SiliconWizard:
--- Quote from: RoGeorge on April 01, 2023, 08:32:58 am ---I always thought modulo division is the proper way to bound the randomness of a big number to a given range.
Are you saying that if we divide a big pseudo-random number by n, the remainder is no longer a pseudo-random [0..n)? I don't understand, why/how the randomness would be lost?
--- End quote ---
This is a very common issue with PRNGs in general, especially the simpler ones. There's a lot of litterature about PRNGs that is as interesting as it can get hairy.
Now the only way of characterizing the properties of a PRNG is to use a range of statistical tests. A "classic" and relatively recent paper about it:
https://www.nist.gov/publications/statistical-test-suite-random-and-pseudorandom-number-generators-cryptographic
An example among many of the exact same question: https://stackoverflow.com/questions/13104478/uniformity-of-random-numbers-taken-modulo-n
As is quoted in the mentioned thread, the C standard itself even states (about the rand() function):
--- Quote ---Recommended practice
There are no guarantees as to the quality of the random sequence produced and some implementa-
tions are known to produce sequences with distressingly non-random low-order bits. Applications
with particular requirements should use a generator that is known to be sufficient for their needs.
--- End quote ---
Taking the LSB of a PRNG output directly, unless said PRNG is carefully selected for this particular property, is not going to be a good idea in general.
You can either look for a better PRNG for this use case and use the LSB (mod 2) as the output - can be a fun exercise as I said but possibly time-consuming - or, as I suggested earlier, just use the base Random function at your disposal here but with the highest range it supports (check with the implementation, I suspect it is 2^32-1 ?), and compare its output with the middle value (2^31) to get your output bit. Should be distributed in a significantly more "random" manner than taking the mod 2 (so just taking the LSB), with a range of even mediocre PRNGs.
Let me know how that works out.
Statistical properties of PRNGs is an interesting topic and a great rabbit hole in itself.
alm:
I think it's important to point out that there's a distinction between 'toy' PRNGs like C's rand() function, which you might use to pick a random background picture to show, and the cryptography grade PRNGs that you might use to generate an encryption key.
The former is likely to suffer from this, the latter probably much less so. An alternative might be running the output of a PRNG through a cryptographic hash function, which won't improve randomness but should distribute it evenly across the bits. But that may not be any faster / smaller than a better PRNG.
ebastler:
--- Quote from: SiliconWizard on April 02, 2023, 02:59:37 am ---[I suggested earlier, just use the base Random function at your disposal here but with the highest range it supports (check with the implementation, I suspect it is 2^32-1 ?), and compare its output with the middle value (2^31) to get your output bit. Should be distributed in a significantly more "random" manner than taking the mod 2 (so just taking the LSB),
--- End quote ---
Doesn't your proposed method boil down to just taking the MSB? Why would that be any better for simple pseudo-random number generators?
snarkysparky:
Improve the PRNG by undersampling it.
With a stream of original values.
Take the middle 4 bits from a returned value A = (V & 0b0000000111100000) >> 5;
Capture A more random words and on the A'th generated value B = (V & 0b0000000111100000) >> 5;
Capture B more random words and on the B'th generated value C = (V & 0b0000000111100000) >> 5;
Capture C more random words and on the C'th generated value D = (V & 0b0000000111100000) >> 5;
Out = ( A << 12) | ( B << 8 ) | (C << 4) | D
Or any of a nearly infinite number of ways to take some of the entropy from each returned value and combine them.
SiliconWizard:
--- Quote from: ebastler on April 02, 2023, 04:55:19 pm ---
--- Quote from: SiliconWizard on April 02, 2023, 02:59:37 am ---[I suggested earlier, just use the base Random function at your disposal here but with the highest range it supports (check with the implementation, I suspect it is 2^32-1 ?), and compare its output with the middle value (2^31) to get your output bit. Should be distributed in a significantly more "random" manner than taking the mod 2 (so just taking the LSB),
--- End quote ---
Doesn't your proposed method boil down to just taking the MSB? Why would that be any better for simple pseudo-random number generators?
--- End quote ---
Because as is common with "basic implementations" of PRGNs, as the C standard reminds us, the low-order bits are a lot more likely to have a bias than the high-order ones.
As I suggested, try it with the Arduino PRNG and see what you get.
Of course the better approach would be to use a PRNG better suited to this particular goal - that's interesting but is a rabbit hole. So depends on how much time and interest you have in the matter.
One reasonable approach is to try one of the now commonly-used (but still rarely in standard C libraries) Xoroshiro PRNGs. But not all of them, so look up the ones that are most appropriate.
For instance: avoid Xoroshiro128+, they clearly state: "The lowest bits of the output generated by xoroshiro128+ have low quality. The authors of xoroshiro128+ acknowledge that it does not pass all statistical tests, "
Oh, and, if you are curious about what kind of PRNG you get using rand() from the std C library, in particular on an embedded target, so most likely using newlib, here it is:
https://github.com/devkitPro/newlib/blob/master/newlib/libc/stdlib/rand.c
I don't know for sure what the random function in Arduino is based on. I guess it just wraps this, but I'm not sure.
But for those using rand() in C or C++ on practically any MCU these days, that's what you'll get.
It's one of the most basic PRNGs you can get. Does the job for simple stuff but wouldn't pass most of the statistical tests common for characterizing PRNGs.
It's really as barebones as it gets, and getting poor-quality low-order bits with this is no surprise.
Navigation
[0] Message Index
[#] Next page
[*] Previous page
Go to full version