Author Topic: Standard deviation of the mean - what does this mean?  (Read 4661 times)


Offline e61_phil

Re: Standard deviation of the mean - what does this mean?
« Reply #25 on: December 25, 2018, 11:04:41 am »
Just for fun I generated a set of normally distributed random numbers with a standard deviation of 1 ppm. Each number represents one measurement at 10 NPLC. From these numbers I built a second array containing the means of consecutive 10-sample chunks of the first array. This should represent a measurement of the same data at 100 NPLC.

From both arrays the std. deviation and the std. dev. of the mean were calculated.

While the standard deviation quickly converges to a fixed value, the std. deviation of the mean converges to zero.

Nothing new after this thread..
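
For anyone who wants to repeat the experiment, here is a minimal sketch of it in plain Python. It is not the script used for the post: the function name `simulate`, the seed and the list of sample counts are assumptions chosen for illustration. The raw readings stand in for 10 NPLC and the chunk means for 100 NPLC.

```python
import math
import random
import statistics

def simulate(n_readings, chunk=10, sigma=1.0, seed=0):
    """Return ((std, sdom) of raw readings, (std, sdom) of chunk means)."""
    rng = random.Random(seed)
    # Raw readings: normally distributed, sigma in ppm (stands in for 10 NPLC).
    raw = [rng.gauss(0.0, sigma) for _ in range(n_readings)]
    # Means of consecutive chunks (stands in for 100 NPLC when chunk=10).
    means = [statistics.fmean(raw[i:i + chunk])
             for i in range(0, n_readings - chunk + 1, chunk)]

    def std_and_sdom(x):
        s = statistics.stdev(x)          # sample std. deviation (n - 1)
        return s, s / math.sqrt(len(x))  # std. deviation of the mean

    return std_and_sdom(raw), std_and_sdom(means)

for n in (100, 1000, 10000, 100000):
    (s_raw, m_raw), (s_avg, m_avg) = simulate(n)
    print(f"n={n:6d}  raw: std={s_raw:.3f} sdom={m_raw:.4f}  "
          f"chunk means: std={s_avg:.3f} sdom={m_avg:.4f}")
```

With more readings the two std. deviations should settle near 1 and 1/sqrt(10), while both std. deviations of the mean keep shrinking and stay close to each other.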
 

Offline pwlps

Re: Standard deviation of the mean - what does this mean?
« Reply #26 on: January 02, 2019, 04:46:42 pm »
Moon Winx is right:

Quote
As you can tell, the standard deviation of the mean stays relatively constant regardless of how we manipulate our samples and averaging. Knowing this, we should assume that our sample standard deviation of meter readings is already representing the standard deviation of the mean and should _not_ divide this result by sqrt(n). In other words, we should not be able to manipulate our Type A uncertainties by how we choose the meter averaging/NPLCs combined with how many samples we take from the meter. We should always assume the meter is returning a mean, so our regular std dev calculations of these means already represents the standard deviation of the mean.

If you take a sum of N independent normal (Gaussian) variables, each with standard deviation S (variance S^2), you get a normal variable with standard deviation S*sqrt(N) (variance N*S^2); therefore an average of N such variables has standard deviation S/sqrt(N).  Now if you divide N into N1 batches of N2 samples each (N=N1*N2), the average of each batch has standard deviation S/sqrt(N2), and repeating the process, the average of averages has standard deviation (S/sqrt(N2))/sqrt(N1) = S/sqrt(N1*N2).  Therefore the result does not depend on how you divide the population and, as you say, "our regular std dev calculations of these means already represents the standard deviation of the mean".
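
A quick numerical check of this argument, as a sketch only: the values of `S`, `N` and `TRIALS` and the helper `batch_experiment` are assumptions made up for the illustration. For each batch size N2 it estimates the spread of the batch averages (expected S/sqrt(N2)) and of the average of averages (expected S/sqrt(N), whatever N2 is).

```python
import math
import random
import statistics

S = 1.0        # standard deviation of one sample (assumed value)
N = 24         # total samples per simulated experiment
TRIALS = 4000  # number of simulated experiments
rng = random.Random(1)

def batch_experiment(n2):
    """Return (std of batch means, std of the average of batch means)."""
    all_batch_means, grand_means = [], []
    for _ in range(TRIALS):
        x = [rng.gauss(0.0, S) for _ in range(N)]
        # Split the N samples into N1 = N // n2 batches of n2 samples each.
        means = [statistics.fmean(x[i:i + n2]) for i in range(0, N, n2)]
        all_batch_means.extend(means)
        grand_means.append(statistics.fmean(means))
    return statistics.stdev(all_batch_means), statistics.stdev(grand_means)

for n2 in (1, 2, 4, 12, 24):
    batch_std, grand_std = batch_experiment(n2)
    print(f"N2={n2:2d} N1={N // n2:2d}  "
          f"batch means: {batch_std:.3f} (S/sqrt(N2)={S / math.sqrt(n2):.3f})  "
          f"average of averages: {grand_std:.3f} (S/sqrt(N)={S / math.sqrt(N):.3f})")
```

The last column should stay near S/sqrt(24) for every split, within sampling noise.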

That said, we have in principle to be careful here, because the sample standard deviation (with sqrt(N-1)) is not a distribution parameter like the variance: it is only an estimator of the distribution's spread and is itself a random variable (the sample variance, suitably scaled, follows a chi-squared distribution).  In practice the standard deviation of a sum of N independent variables also scales like sqrt(N) times the standard deviation of each of them, so the formulas used above apply here too, and the standard deviation should not depend on how you divide the population either.
Note that in real life the variables might not be quite independent (for example if there is a slow oscillation in the measured values); in that case we have to use the formula (https://en.wikipedia.org/wiki/Standard_deviation):

stdev(X+Y)=sqrt(stdev(X)^2+stdev(Y)^2+2*cov(X,Y))

If the covariance is not zero, the obtained std dev is no longer a good estimate of the distribution's spread and will probably depend on the size of the batches.
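
To illustrate the batch-size dependence, here is a rough sketch with made-up data: white noise of standard deviation 1, and the same noise with a hypothetical slow oscillation (period 1000 samples, amplitude 1) added. The helper `scaled_batch_std` is an assumption for this example; it multiplies the std dev of the batch means by sqrt(N2), which should stay near 1 for any batch size if the samples are independent.

```python
import math
import random
import statistics

def scaled_batch_std(x, n2):
    """Std dev of the batch means times sqrt(n2); roughly constant for independent data."""
    means = [statistics.fmean(x[i:i + n2])
             for i in range(0, len(x) - n2 + 1, n2)]
    return statistics.stdev(means) * math.sqrt(n2)

rng = random.Random(2)
n = 20000
# Independent readings with standard deviation 1.
white = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Same noise plus a slow oscillation (hypothetical: period 1000 samples, amplitude 1).
drifting = [w + math.sin(2 * math.pi * i / 1000) for i, w in enumerate(white)]

for n2 in (1, 10, 100):
    print(f"N2={n2:3d}  independent: {scaled_batch_std(white, n2):.2f}  "
          f"with oscillation: {scaled_batch_std(drifting, n2):.2f}")
```

For the correlated data the scaled value grows with the batch size, because averaging over a batch much shorter than the oscillation period removes the noise but not the oscillation.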
 
The following users thanked this post: Moon Winx

