Author Topic: Odds seem too high when guessing random zeroes and ones  (Read 2723 times)

0 Members and 1 Guest are viewing this topic.

Offline RoGeorgeTopic starter

  • Super Contributor
  • ***
  • Posts: 7012
  • Country: ro
Odds seem too high when guessing random zeroes and ones
« on: March 19, 2023, 06:10:20 pm »
I've made a trivial Arduino demo that generates pseudorandom 0/1 bits, and lights up a different LED for each (the onboard LED and an extra LED connected at pin 12).

Code: [Select]
#define EXTRA_LED 12
#define BOARD_LED LED_BUILTIN

long randNr;

void setup() {
  // put your setup code here, to run once:
  Serial.begin(9600);
  randomSeed(analogRead(0));

  pinMode(BOARD_LED, OUTPUT);
  pinMode(EXTRA_LED, OUTPUT);
}

void loop() {
  // put your main code here, to run repeatedly:

  // blip, blip, blip, random draw
  digitalWrite(BOARD_LED, 0);
  digitalWrite(EXTRA_LED, 0);
  delay(500);
  for (unsigned char c=0; c < 3; c++) {
    digitalWrite(BOARD_LED, 0);
    digitalWrite(EXTRA_LED, 0);
    delay(190);
    digitalWrite(BOARD_LED, 1);
    digitalWrite(EXTRA_LED, 1);
    delay(10);
  }
  digitalWrite(BOARD_LED, 0);
  digitalWrite(EXTRA_LED, 0);
  delay(1000);

  randNr = random(2);
  switch (randNr) {
    case 0:
      digitalWrite(BOARD_LED, 0);
      digitalWrite(EXTRA_LED, 1);
      break;
    case 1:
      digitalWrite(BOARD_LED, 1);
      digitalWrite(EXTRA_LED, 0);
      break;
  }
  delay(1500);
}

The strange thing is that when guessing in advance which LED will light up next, chances of guessing seems too high.  Happens to guess correctly 3-4 times in a row, once it was 5 or 6 consecutive good guesses.  Only once or twice guessed wrong from the first try.  The chances to fail from the first guess should be 50%.  :-//

Each time only played until the first wrong guess, then forget about it.  I've only tried to guess it about 10-20 times or so today, playing only occasionally, each time when the blinking LED happen to distract by accident (it's running non stop on a shelf in the lab). The number of drawings so far is probably too small to be representative, but still, very uncanny odds.  ???

I suspect it's some psychological effect, where the odds seems much better than they really are.
Does it make any sense what I'm describing here, anybody else ever noticed something similar?
« Last Edit: March 19, 2023, 06:33:58 pm by RoGeorge »
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11905
  • Country: us
    • Personal site
Re: Odds seem too high when guessing random zeroes and ones
« Reply #1 on: March 19, 2023, 06:26:41 pm »
This is a very known thing. For short runs random sequences may not "look" random. Guess 100 times and see how you do. But write down the results so it is not skewed by memory.

There are videos where people working with statistics try to distinguish sequences coming out of the true RNG and generated by a human to look random. In most cases the biggest tell is long runs of the same outcome, which happens in true randomness a lot, but humans can't bring themselves to do that. "You can't have 7 tails in a row, that's not random".
« Last Edit: March 19, 2023, 06:30:33 pm by ataradov »
Alex
 
The following users thanked this post: tom66, RJSV

Offline iMo

  • Super Contributor
  • ***
  • Posts: 5570
  • Country: va
Re: Odds seem too high when guessing random zeroes and ones
« Reply #2 on: March 19, 2023, 06:45:37 pm »
..I suspect it's some psychological effect, where the odds seems much better than they really are.
Does it make any sense what I'm describing here, anybody else ever noticed something similar?
Sure, it does.. When somebody works with the arduino for too long, he/she starts to think the same way..  >:D
Readers discretion is advised..
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9336
  • Country: fi
Re: Odds seem too high when guessing random zeroes and ones
« Reply #3 on: March 19, 2023, 07:14:52 pm »
There are videos where people working with statistics try to distinguish sequences coming out of the true RNG and generated by a human to look random. In most cases the biggest tell is long runs of the same outcome, which happens in true randomness a lot, but humans can't bring themselves to do that. "You can't have 7 tails in a row, that's not random".

Yeah. We are generating random 32-bit device IDs represented as 8-digit hexadecimals (like A05D31DE). And if you look at these device IDs, more than half of them give you bad vibes, "there must be something wrong with our random number generator". Some have exclusively letters A-F with no digits. Others have numbers only. Some have words like BAD in them. Nearly all seem to have some kind of pattern.

This is because human brain is hardwired to detect patterns. Randomness will have patterns, too - just random patterns.
 
The following users thanked this post: thm_w, tooki, newbrain

Online Bud

  • Super Contributor
  • ***
  • Posts: 7276
  • Country: ca
Re: Odds seem too high when guessing random zeroes and ones
« Reply #4 on: March 19, 2023, 07:58:20 pm »
@imo Gee mate, you really gone wild lately with your country flags. Made me Google for Tokelau today  :D
Facebook-free life and Rigol-free shack.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15800
  • Country: fr
Re: Odds seem too high when guessing random zeroes and ones
« Reply #5 on: March 19, 2023, 08:22:23 pm »
This is a very known thing. For short runs random sequences may not "look" random. Guess 100 times and see how you do. But write down the results so it is not skewed by memory.

There are videos where people working with statistics try to distinguish sequences coming out of the true RNG and generated by a human to look random. In most cases the biggest tell is long runs of the same outcome, which happens in true randomness a lot, but humans can't bring themselves to do that. "You can't have 7 tails in a row, that's not random".

That's for sure, but first thing before resorting to psychology would be to check the quality of the PRNG used in this particular case. And not enough "randomness" in short-term sequences is also a known flaw in some PRNGs (so, something perfectly objective), or at least something that may not be desirable in a given application, and certainly some PRNGs are better than others.

I wouldn't bet much on the default Arduino PRNG.

Oh, and side note is, I don't know exactly what random(2) does behind the scenes. But if say, it just truncates a PRN to its LSB, except for very specific PRNGs, the end-result is likely to be horrible in terms of distribution. The best bet is usually to take the "native" output of the PRNG and compare it with  the middle value to get a single bit of output. Truncating will (with few exceptions for specifically-tailored PRNGs) yield pretty bad sequences.
« Last Edit: March 19, 2023, 08:27:37 pm by SiliconWizard »
 

Offline jpanhalt

  • Super Contributor
  • ***
  • Posts: 4005
  • Country: us
Re: Odds seem too high when guessing random zeroes and ones
« Reply #6 on: March 19, 2023, 08:25:46 pm »
1) How many LED's do you have?
2) Could this be remotely related to Benford's law?
3) What algorithm does the pseudo random generator use?
 

Offline mendip_discovery

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: gb
Re: Odds seem too high when guessing random zeroes and ones
« Reply #7 on: March 19, 2023, 08:57:43 pm »
I remember doing some stuff on the Amiga and a 486. They really hated random numbers, often picking just he next number in the list.

I remember watching somthing about cloudflare and how they have a room full of larva lamps and film them to get some random inputs that they then feed into the system.

Could you take a random word, hash it a random amount of times and then pick a random number from that?
Motorcyclist, Nerd, and I work in a Calibration Lab :-)
--
So everyone is clear, Calibration = Taking Measurement against a known source, Verification = Checking Calibration against Specification, Adjustment = Adjusting the unit to be within specifications.
 

Offline RoGeorgeTopic starter

  • Super Contributor
  • ***
  • Posts: 7012
  • Country: ro
Re: Odds seem too high when guessing random zeroes and ones
« Reply #8 on: March 19, 2023, 09:06:45 pm »
1) How many LED's do you have?
2) Could this be remotely related to Benford's law?
3) What algorithm does the pseudo random generator use?

2 LEDs.  After a random bit was draw, they lit either on/off or off/on.
- Random bit was 0, then make led1=off, led2=on
- Random bit was 1, then make led1=on, led2=off

I don't know what algorithm is used in the pseudorandom generator, but I guess it's all working OK.  Random(2) will return either 0 or 1 with 50% chances each:  https://reference.arduino.cc/reference/en/language/functions/random-numbers/random/



Tried to play some more games while writing down the results, so to get some numbers instead of just an impression.

I think I was cherry-picking badly (without noticing), for example by discarding many consecutive wrong guesses as a single "bad start", when in fact those consecutive wrong guesses were not one bad start (a single game), bud many distinguished failed games, just that they happen to come all in a row.

Offline eugene

  • Frequent Contributor
  • **
  • Posts: 497
  • Country: us
Re: Odds seem too high when guessing random zeroes and ones
« Reply #9 on: March 31, 2023, 03:50:01 pm »
Just for reference, when compiling for the Arduino Uno (or just about any AVR based Arduino-like board) AVR-GCC is used along with AVR-LIBC. If you install support for other, non-AVR boards then another compiler will be installed, typically some version of gcc.

Reference for the use of AVR-LIBC can be found here. Specifically, the stdlib library reference is here. Both random() and rand() are in there.

Source code for stdlib that is part of AVR-GCC is here.
90% of quoted statistics are fictional
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15800
  • Country: fr
Re: Odds seem too high when guessing random zeroes and ones
« Reply #10 on: March 31, 2023, 09:15:47 pm »
As I mentioned, PRNGs are not trivial things.

Even if your base PRNG is decent, once you multiply/divide its output by a given ratio to change the scale of the output, you're going to screw up the characteristics of the PRNG  more or less drastically depending on the PRNG and the scaling ratio. If the scaling ratio is very small, like in this case (we can assume that the base PRNG has a native 32-bit width), scaling it down to [0, 1] will probably make the characteristics of the ouput pretty ugly on a short term. Sure it's more or  less going to be 50% on average for a very long sequence, but it's very likely to have relatively 'long' same-bit patterns on short term.

I would highly suggest rolling your own 1-bit PRNG for this. Additionally, it's a fun exercise.
 

Offline RoGeorgeTopic starter

  • Super Contributor
  • ***
  • Posts: 7012
  • Country: ro
Re: Odds seem too high when guessing random zeroes and ones
« Reply #11 on: April 01, 2023, 08:32:58 am »
I always thought modulo division is the proper way to bound the randomness of a big number to a given range.

Are you saying that if we divide a big pseudo-random number by n, the remainder is no longer a pseudo-random [0..n)?  I don't understand, why/how the randomness would be lost?
 
The following users thanked this post: Karel

Offline Psi

  • Super Contributor
  • ***
  • Posts: 10385
  • Country: nz
Re: Odds seem too high when guessing random zeroes and ones
« Reply #12 on: April 01, 2023, 08:41:06 am »
In many cases where you need randomness for visual stuff, you're better to have a block of memory/eeprom where you hard code a chunk of random data. Then you can step through the data by 1, then by 2, then by 3 etc.. to create a much larger block of random data.  You can pre-filter the data to ensure there's no long blocks of the same numbers, which can happen in true random data but may be undesirable in some applications.
You can even use a psudo random number generator to index your block of custom random data etc..
Greek letter 'Psi' (not Pounds per Square Inch)
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 2275
  • Country: 00
Re: Odds seem too high when guessing random zeroes and ones
« Reply #13 on: April 01, 2023, 08:53:23 am »
I always thought modulo division is the proper way to bound the randomness of a big number to a given range.

Are you saying that if we divide a big pseudo-random number by n, the remainder is no longer a pseudo-random [0..n)?  I don't understand, why/how the randomness would be lost?

I would like to know that as well. Any experts here willing to comment on that?
 

Offline alm

  • Super Contributor
  • ***
  • Posts: 2903
  • Country: 00
Re: Odds seem too high when guessing random zeroes and ones
« Reply #14 on: April 01, 2023, 11:26:56 am »
Even if your base PRNG is decent, once you multiply/divide its output by a given ratio to change the scale of the output, you're going to screw up the characteristics of the PRNG  more or less drastically depending on the PRNG and the scaling ratio. If the scaling ratio is very small, like in this case (we can assume that the base PRNG has a native 32-bit width), scaling it down to [0, 1] will probably make the characteristics of the ouput pretty ugly on a short term. Sure it's more or  less going to be 50% on average for a very long sequence, but it's very likely to have relatively 'long' same-bit patterns on short term.
I'd be curious to see a reference about this. As far as I know a PRNG just generates a number of bits of randomness, and has no knowledge about what these bits mean. Obviously you might lose resolution if you multiply / divide an integer by a large number (like shift right 16 bits), but that is nothing specific to PRNGs.

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15800
  • Country: fr
Re: Odds seem too high when guessing random zeroes and ones
« Reply #15 on: April 02, 2023, 02:59:37 am »
I always thought modulo division is the proper way to bound the randomness of a big number to a given range.

Are you saying that if we divide a big pseudo-random number by n, the remainder is no longer a pseudo-random [0..n)?  I don't understand, why/how the randomness would be lost?

This is a very common issue with PRNGs in general, especially the simpler ones. There's a lot of litterature about PRNGs that is as interesting as it can get hairy.
Now the only way of characterizing the properties of a PRNG is to use a range of statistical tests. A "classic" and relatively recent paper about it:
https://www.nist.gov/publications/statistical-test-suite-random-and-pseudorandom-number-generators-cryptographic

An example among many of the exact same question: https://stackoverflow.com/questions/13104478/uniformity-of-random-numbers-taken-modulo-n

As is quoted in the mentioned thread, the C standard itself even states (about the rand() function):
Quote
Recommended practice
There are no guarantees as to the quality of the random sequence produced and some implementa-
tions are known to produce sequences with distressingly non-random low-order bits
. Applications
with particular requirements should use a generator that is known to be sufficient for their needs.

Taking the LSB of a PRNG output directly, unless said PRNG is carefully selected for this particular property, is not going to be a good idea in general.

You can either look for a better PRNG for this use case and use the LSB (mod 2) as the output - can be a fun exercise as I said but possibly time-consuming - or, as I suggested earlier, just use the base Random function at your disposal here but with the highest range it supports (check with the implementation, I suspect it is 2^32-1 ?), and compare its output with the middle value (2^31) to get your output bit. Should be distributed in a significantly more "random" manner than taking the mod 2 (so just taking the LSB), with a range of even mediocre PRNGs.
Let me know how that works out.

Statistical properties of PRNGs is an interesting topic and a great rabbit hole in itself.
 
The following users thanked this post: RoGeorge

Offline alm

  • Super Contributor
  • ***
  • Posts: 2903
  • Country: 00
Re: Odds seem too high when guessing random zeroes and ones
« Reply #16 on: April 02, 2023, 07:28:22 am »
I think it's important to point out that there's a distinction between 'toy' PRNGs like C's rand() function, which you might use to pick a random background picture to show, and the cryptography grade PRNGs that you might use to generate an encryption key.

The former is likely to suffer from this, the latter probably much less so. An alternative might be running the output of a PRNG through a cryptographic hash function, which won't improve randomness but should distribute it evenly across the bits. But that may not be any faster / smaller than a better PRNG.

Offline ebastler

  • Super Contributor
  • ***
  • Posts: 7375
  • Country: de
Re: Odds seem too high when guessing random zeroes and ones
« Reply #17 on: April 02, 2023, 04:55:19 pm »
[I suggested earlier, just use the base Random function at your disposal here but with the highest range it supports (check with the implementation, I suspect it is 2^32-1 ?), and compare its output with the middle value (2^31) to get your output bit. Should be distributed in a significantly more "random" manner than taking the mod 2 (so just taking the LSB),

Doesn't your proposed method boil down to just taking the MSB? Why would that be any better for simple pseudo-random number generators?
 

Offline snarkysparky

  • Frequent Contributor
  • **
  • Posts: 419
  • Country: us
Re: Odds seem too high when guessing random zeroes and ones
« Reply #18 on: April 02, 2023, 05:21:49 pm »
Improve the PRNG by undersampling it.

With a stream of original values.

Take the middle 4 bits from a returned value                                   A =  (V & 0b0000000111100000) >> 5;

Capture  A more random words   and on the A'th  generated value    B =  (V & 0b0000000111100000) >> 5;

Capture  B more random words   and on the B'th  generated value    C =  (V & 0b0000000111100000) >> 5;

Capture  C more random words   and on the C'th  generated value    D =  (V & 0b0000000111100000) >> 5;

Out  =   ( A << 12) |  ( B << 8 )   |  (C << 4)  |  D

Or any of a nearly infinite number of ways to take some of the entropy from each returned value and combine them.








« Last Edit: April 02, 2023, 05:32:06 pm by snarkysparky »
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15800
  • Country: fr
Re: Odds seem too high when guessing random zeroes and ones
« Reply #19 on: April 05, 2023, 08:03:40 pm »
[I suggested earlier, just use the base Random function at your disposal here but with the highest range it supports (check with the implementation, I suspect it is 2^32-1 ?), and compare its output with the middle value (2^31) to get your output bit. Should be distributed in a significantly more "random" manner than taking the mod 2 (so just taking the LSB),

Doesn't your proposed method boil down to just taking the MSB? Why would that be any better for simple pseudo-random number generators?

Because as is common with "basic implementations" of PRGNs, as the C standard reminds us, the low-order bits are a lot more likely to have a bias than the high-order ones.
As I suggested, try it with the Arduino PRNG and see what you get.

Of course the better approach would be to use a PRNG better suited to this particular goal - that's interesting but is a rabbit hole. So depends on how much time and interest you have in the matter.
One reasonable approach is to try one of the now commonly-used (but still rarely in standard C libraries) Xoroshiro PRNGs. But not all of them, so look up the ones that are most appropriate.
For instance: avoid Xoroshiro128+, they clearly state: "The lowest bits of the output generated by xoroshiro128+ have low quality. The authors of xoroshiro128+ acknowledge that it does not pass all statistical tests, "

Oh, and, if you are curious about what kind of PRNG you get using rand() from the std C library, in particular on an embedded target, so most likely using newlib, here it is:

https://github.com/devkitPro/newlib/blob/master/newlib/libc/stdlib/rand.c

I don't know for sure what the random function in Arduino is based on. I guess it just wraps this, but I'm not sure.
But for those using rand() in C or C++ on practically any MCU these days, that's what you'll get.

It's one of the most basic PRNGs you can get. Does the job for simple stuff but wouldn't pass most of the statistical tests common for characterizing PRNGs.
It's really as barebones as it gets, and getting poor-quality low-order bits with this is no surprise.

« Last Edit: April 05, 2023, 08:20:57 pm by SiliconWizard »
 
The following users thanked this post: ebastler

Offline bookaboo

  • Frequent Contributor
  • **
  • Posts: 763
  • Country: ie
Re: Odds seem too high when guessing random zeroes and ones
« Reply #20 on: April 05, 2023, 08:43:05 pm »
It could be urban myth but the story is entirely plausible that in early generations of iPods Apple used a (for all intents and purposes) true RNG algorithm to choose songs. People complained that it wasn't random enough because it would play certain bands more than others, or in too long a consecutive sequence.
They modified it to be less (mathematically) random using weighting and balancing..... everybody happy.

« Last Edit: April 05, 2023, 08:46:39 pm by bookaboo »
 

Offline bookaboo

  • Frequent Contributor
  • **
  • Posts: 763
  • Country: ie
Re: Odds seem too high when guessing random zeroes and ones
« Reply #21 on: April 05, 2023, 08:46:18 pm »
Would be easy enough to check the arduino RNG? Run a counter on that program for a few million iterations (without the delays obviously).
Add some variants and there's a YouTube video for someone to make.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15800
  • Country: fr
Re: Odds seem too high when guessing random zeroes and ones
« Reply #22 on: April 05, 2023, 08:55:30 pm »
As to the random() function in Arduino, if the target is an AVR, then it's not from newlib, but from avr-libc, and the implementation can be seen here:

https://github.com/avrdudes/avr-libc/blob/main/libc/stdlib/random.c

It's not any better than the newlib one. It's similar, but actually probably slightly worse.

 

Offline AndyBeez

  • Frequent Contributor
  • **
  • Posts: 858
  • Country: nu
Re: Odds seem too high when guessing random zeroes and ones
« Reply #23 on: April 05, 2023, 09:11:00 pm »
To prove the randomness of any random number generator, whether it is on a pocket calculator, a crypto cypher generator, or a national lottery supercomputer, you need to calculate the Standard Deviation of the random sequence. A pure random sequence between 0.0 and 1.0 will have a SD of exactly one-half.  Any other number, even 0.501 or 0.49999999999 indicates bias or 'clustering'. Averaging over even a small sample set will show whether a bias exists.

OP : Maybe it's not random - it is a (crap) psuedo random algorithm - or your mind really is controlling the 8 bit realm?

Question for the EEs : would a noise diode on the analog pin provide a more randomised-random seed?

« Last Edit: April 05, 2023, 09:37:59 pm by AndyBeez »
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 4034
  • Country: us
Re: Odds seem too high when guessing random zeroes and ones
« Reply #24 on: April 05, 2023, 09:12:41 pm »
I always thought modulo division is the proper way to bound the randomness of a big number to a given range.

Are you saying that if we divide a big pseudo-random number by n, the remainder is no longer a pseudo-random [0..n)?  I don't understand, why/how the randomness would be lost?

This depends strongly on the type of RNG.  But for "traditional" algorithms like a linear congruential generator (still used especially where computing power is low and by lazy developers), you always want to use floor division rather than remainder.  The reason is that these PRNGs use multiplication and addition, and the carry values propagate only to the left.  If you do modulo division you are looking mostly at the low bits that often have much shorter cycles and have much more obvious patterns.  In the extreme case if you use (prng() % 2) and the random number generator library is naively giving you the entire internal state as the random number, rather than only the most significant bits, you can get a "random" bit that always alternates 010101...

Note that I do not think this is the problem here.  What is described is a well known cognitive bias, and even most very poor random number generators will produce sequences not easily noticed by humans playing a guessing game.  But it is certainly possible to misuse a library PRNG and get extremely short cycles.
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf