Author Topic: Standard deviation of the mean - what does this mean?  (Read 5891 times)

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Standard deviation of the mean - what does this mean?
« on: December 23, 2018, 06:56:04 am »
When you study how to perform uncertainty analysis on a measurement, you'll come across a guide that tells you that you can use the standard deviation of the mean of your readings as your Type A repeatability component. This is your calculated sample standard deviation divided by the square root of the number of readings you have made.
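As a minimal sketch of that calculation (Python standard library only; the readings are made-up numbers for illustration, and `std_of_mean` is a hypothetical helper name, not something from the GUM):

```python
import math
import statistics

def std_of_mean(readings):
    """Sample standard deviation (n-1 denominator) divided by sqrt(n)."""
    n = len(readings)
    return statistics.stdev(readings) / math.sqrt(n)

# Hypothetical voltmeter readings in volts, invented for this example.
readings = [10.00012, 10.00009, 10.00015, 10.00011, 10.00008]

print("mean            :", statistics.mean(readings))
print("sample std dev  :", statistics.stdev(readings))
print("std dev of mean :", std_of_mean(readings))
```

The sample standard deviation describes the scatter of individual readings; the last number describes how well the mean of those readings is known.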

This has always confused me because sample time is cheap and so stating that performing excess sampling on your DUT can reduce Type A uncertainty to almost nil seems wrong. But then I realized something that doesn't seem to be common knowledge in the metrology community: your readings from a voltmeter are already means, so you don't get to calculate a standard deviation of the mean of the means. I *think* the guidance in the GUM is stating that your samples' standard deviation calculation is actually the standard deviation of the mean already.

I was, and still am, unsure what is meant by the phrase "standard deviation of the mean", so I did some calculations in an Excel spreadsheet to try to clear things up for myself:
 
Suppose you had 10,000 samples of a voltage. You could calculate the mean and standard deviation. Or suppose you took 10 sets of 1,000 samples from the same data set. You could calculate the mean of each set and the standard deviation of those means. Or you could have 100 sets of 100 samples and do the same, or 2 sets of 5,000 samples, or whatever other combination you desire. The result is that your mean will remain relatively constant but your standard deviation will diminish (because the standard deviation between averages of samples will be smaller than the standard deviation of the population of samples) and will close in on some number. This number happens to be the standard deviation of the mean. Meaning, you can calculate the standard deviation of the 10,000 samples and divide this by the square root of the number of samples and it would roughly be equal to calculating the standard deviation of 10 averages of sets of 1,000 samples of the same data set.

So my argument, and perhaps it is already common knowledge and just not in my circle, is that the guidance to use standard deviation of the mean in our uncertainty calculations is a bit misleading. It should state that our calculations of standard deviation of samples is really a standard deviation of the mean.

It is Saturday night and I'm making a post about this. What the hell have I done with my life.



 

Offline thermistor-guy

  • Frequent Contributor
  • **
  • Posts: 398
  • Country: au
Re: Standard deviation of the mean - what does this mean?
« Reply #1 on: December 23, 2018, 08:12:20 am »
When you study how to perform uncertainty analysis on a measurement, you'll come across a guide that tells you that you can use the standard deviation of the mean of your readings as your Type A repeatability component. This is your calculated sample standard deviation divided by the square root of the number of readings you have made.

This has always confused me because sample time is cheap and so stating that performing excess sampling on your DUT can reduce Type A uncertainty to almost nil seems wrong...

It is wrong. Your measurement samples of the DUT have a type A and a type B component. Repeated sampling gives you better bounds on the type A (statistical) repeatability - that is, on the probability distribution underlying your measurements.

Ideally, for a stable measurement setup, and with an increasing number of measurements, the std dev of your readings converges towards a certain value, characteristic of the random processes at work in your measurement setup. In general, that value won't be zero.

That still leaves the type B component - the systematic error in your setup. Excess sampling won't reduce it.
 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #2 on: December 23, 2018, 09:09:14 am »
As I stated, I'm referring to Type A uncertainty. The guidance allows you to reduce the Type A to a negligible amount simply by taking more samples. Sometimes, Type A uncertainty swamps the Type Bs, so reducing this because I just take more samples seems wrong to me and I'm trying to figure out why.
 
The following users thanked this post: joseph nicholas

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #3 on: December 23, 2018, 09:14:59 am »
To add, if you choose to calculate the standard deviation of the mean, your Type A will always converge to 0 given an infinite amount of samples. The sample standard deviation would not, as you describe. 
 

Offline thermistor-guy

  • Frequent Contributor
  • **
  • Posts: 398
  • Country: au
Re: Standard deviation of the mean - what does this mean?
« Reply #4 on: December 23, 2018, 09:43:22 am »
To add, if you choose to calculate the standard deviation of the mean, your Type A will always converge to 0 given an infinite amount of samples...

This is where we disagree.

Take a six-sided die. Throw it  and read the top of the die (your measurement) a few times. Calculate the mean and std dev.

Are you saying that, with more and more throws, the calculated std dev converges to zero?
 
The following users thanked this post: e61_phil

Offline bsdphk

  • Regular Contributor
  • *
  • Posts: 216
  • Country: dk
Re: Standard deviation of the mean - what does this mean?
« Reply #5 on: December 23, 2018, 09:50:33 am »
This is your calculated sample standard deviation divided by the square root of the number of readings you have made.

You are mixing things up here.

Standard deviation is what it is, you don't divide it by anything.

The divide by sqrt(N) is an estimator for how much white noise you have left if you average N samples, and yes, given enough samples you can almost make white noise go away, but there are some pretty big footnotes, one of which is that your average is static and stable.
 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #6 on: December 23, 2018, 09:53:13 am »
To add, if you choose to calculate the standard deviation of the mean, your Type A will always converge to 0 given an infinite amount of samples...

This is where we disagree.

Take a six-sided die. Throw it  and read the top of the die (your measurement) a few times. Calculate the mean and std dev.

Are you saying that, with more and more throws, the calculated std dev converges to zero?

The standard deviation will not converge to zero. The standard deviation of the mean will converge to zero.

If you take f(n) = 1/SQRT(n) with n = number of samples, then it is a mathematical certainty that the function will converge to zero as n increases.
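The two quantities in the die experiment can be told apart with a quick simulation. This is only a sketch, assuming a fair six-sided die simulated with Python's `random` module; `summarize` is a name made up here.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable
throws = [random.randint(1, 6) for _ in range(100_000)]

def summarize(n):
    """Return (std dev, std dev of the mean) of the first n throws."""
    s = statistics.stdev(throws[:n])
    return s, s / math.sqrt(n)

for n in (10, 100, 1_000, 10_000, 100_000):
    s, sem = summarize(n)
    print(f"n={n:>6}  std dev={s:.3f}  std dev of mean={sem:.4f}")
```

For a fair die the std dev should settle near sqrt(35/12), about 1.71, while the std dev of the mean keeps shrinking as 1/sqrt(n).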

 
The following users thanked this post: e61_phil

Offline Kleinstein

  • Super Contributor
  • ***
  • Posts: 15356
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #7 on: December 23, 2018, 09:55:03 am »
The simple formula is valid for independent readings, which is generally not exactly the case. This often leads to a high estimate for the type A error.

As a common effect there is usually some extra low-frequency noise or a kind of random-drift background. If one takes more samples over a long time, this often leads to the root mean square of the readings going up with time, up to the point of compensating the 1/sqrt(N) factor.

A way to look at it is to use the Allan deviation to show the correlation over longer times.  Initially the curve usually goes down, which shows that averaging helps, but for very long times the curve tends to flatten or even go up again, as other effects can take over.
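To illustrate the shape of such a curve, here is a rough sketch using the simple non-overlapping Allan deviation estimator. The drift model (a slow random walk added to white noise) and all noise levels are assumptions chosen for the example, not measurements from any real meter.

```python
import math
import random

def allan_deviation(readings, m):
    """Non-overlapping Allan deviation for blocks of m consecutive readings."""
    block_means = [
        sum(readings[i:i + m]) / m
        for i in range(0, len(readings) - m + 1, m)
    ]
    diffs = [b - a for a, b in zip(block_means, block_means[1:])]
    return math.sqrt(0.5 * sum(d * d for d in diffs) / len(diffs))

random.seed(3)
n = 20_000
white = [random.gauss(0.0, 1.0) for _ in range(n)]

# Assumed drift model: a slow random walk added to the white noise.
walk, drift = 0.0, []
for _ in range(n):
    walk += random.gauss(0.0, 0.05)
    drift.append(walk)
noisy_with_drift = [w + d for w, d in zip(white, drift)]

for m in (1, 10, 100, 1000):
    print(m, allan_deviation(white, m), allan_deviation(noisy_with_drift, m))
```

With pure white noise the values should keep falling roughly as 1/sqrt(m); with the added random walk they should fall at first and then rise again at large m.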

In some cases one can use averaging over a very long time, if there are no interfering effects. However the time needed goes up quite fast, as the square of the added resolution - so there are limits to it.  I once used a lock-in amplifier with integration over 1000 seconds for each point - so a single curve needed a whole weekend.  The noise reduction by averaging still worked, but it is slow.

P.S. The estimator for the std. dev. of the mean has a square root of (N-1) and not N. This is because the mean value is also calculated from those numbers.
 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #8 on: December 23, 2018, 09:56:38 am »
This is your calculated sample standard deviation divided by the square root of the number of readings you have made.

You are mixing things up here.

Standard deviation is what it is, you don't divide it by anything.

The divide by sqrt(N) is an estimator for how much white noise you have left if you average N samples, and yes, given enough samples you can almost make white noise go away, but there are some pretty big footnotes, one of which is that your average is static and stable.

Sorry, but could you explain what you mean by white noise? And why dividing your std dev by sqrt(n) represents this amount?
 

Offline thermistor-guy

  • Frequent Contributor
  • **
  • Posts: 398
  • Country: au
Re: Standard deviation of the mean - what does this mean?
« Reply #9 on: December 23, 2018, 10:12:11 am »
.. given enough samples you can almost make white noise go away ...

In other words, averaging lots of readings gives you a better estimate of the measurement's mean value.

So, returning to the die thought experiment: you throw it many times, and average the readings to reduce the randomness. With more throws, you get a better and better estimate of the mean.

Ok, now I think I understand the OP's point. There is an uncertainty in the mean estimate of the die, which in principle you can reduce to near-zero by a large number of samples. The random process, of the die throw itself, has a variance which doesn't change. But your estimate of the mean keeps improving, in principle. I was confusing the two concepts.
 
The following users thanked this post: Moon Winx

Offline e61_phil

  • Frequent Contributor
  • **
  • Posts: 963
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #10 on: December 23, 2018, 10:39:36 am »
Very nice thread :)

I think that confuses many here, or they don't think about it at all. The standard deviation of the samples itself doesn't mean much. (I know it was already explained here, but again:) It should seem suspicious if 10 readings with 100 NPLC come to a different uncertainty than 100 readings with 10 NPLC (especially because most DMMs do 10 times 10 NPLC if you configure 100 NPLC).

I like the idea of confidence intervals, and the width of these intervals will go down with more and more readings (only the statistical part) even if the std. deviation itself stays.
 

Offline bsdphk

  • Regular Contributor
  • *
  • Posts: 216
  • Country: dk
Re: Standard deviation of the mean - what does this mean?
« Reply #11 on: December 23, 2018, 10:52:18 am »
Sorry, but could you explain what you mean by white noise? And why dividing your std dev by sqrt(n) represents this amount?

I think what you need is a good and approachable textbook on statistics, and if you don't want to stare down a lot of dry theory, I'll recommend Larry Gonick's "Cartoon Guide to Statistics", which is a really good introduction to statistics that everybody can read and understand.
 
The following users thanked this post: TiN

Offline thermistor-guy

  • Frequent Contributor
  • **
  • Posts: 398
  • Country: au
Re: Standard deviation of the mean - what does this mean?
« Reply #12 on: December 23, 2018, 11:55:35 am »
Very nice thread :)
... The standard deviation of the samples itself doesn't mean much...

In this context, yes.

Imagine you have two voltage references. In one situation, you want to compare their mean values to a third, better ref. Which ref has the more accurate mean? Each of the measured DUT means has an uncertainty due to random fluctuations, but also errors due to (slightly) imperfect measurement technique.

In a second situation, you as a designer or user want to answer the question: which voltage ref. has lower noise? Maybe the two refs are similar, but you made a design change to one, and you want to test if the change actually improved the noise characteristics. You want to estimate the variance of both refs, this time - not their means.

Those variance estimates will have uncertainties, again due to randomness and to imperfect measurement technique.

Two measurement situations, but with different goals, and different use of "uncertainty". I tend to be focused on the second situation, but here I confused it with the first. A subtle business.
 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #13 on: December 23, 2018, 07:08:31 pm »
The simple formula is valid for independent readings, which is generally not exactly the case. This often leads to a high estimate for the type A error.

As a common effect there is usually some extra low-frequency noise or a kind of random-drift background. If one takes more samples over a long time, this often leads to the root mean square of the readings going up with time, up to the point of compensating the 1/sqrt(N) factor.

A way to look at it is to use the Allan deviation to show the correlation over longer times.  Initially the curve usually goes down, which shows that averaging helps, but for very long times the curve tends to flatten or even go up again, as other effects can take over.

In some cases one can use averaging over a very long time, if there are no interfering effects. However the time needed goes up quite fast, as the square of the added resolution - so there are limits to it.  I once used a lock-in amplifier with integration over 1000 seconds for each point - so a single curve needed a whole weekend.  The noise reduction by averaging still worked, but it is slow.

I think my objection to the accepted way to apply the sample std dev of the mean is that it allows the practical zeroing out of Type A uncertainty. Meaning, regardless of how noisy or drifty the DUT is, the effect of this is never captured in the measurement uncertainty.

The Allan deviation curve would always converge to zero given enough time if you used the standard deviation of the mean to calculate the points, right? Even if the typical curve bends up after a certain amount of time, as long as it doesn't rise at a rate above sqrt(N), your curve would continuously drop and the optimal sample time would be obscured.

P.S. The estimator for the std. dev. of the mean has a square root of (N-1) and not N. This is because the mean value is also calculated from those numbers.

I've never seen this, and I'm not saying you are wrong but could you point to somewhere that shows this? I've always seen sqrt(N) regardless of what type of standard deviation is calculated.
 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #14 on: December 23, 2018, 07:13:50 pm »
Sorry, but could you explain what you mean by white noise? And why dividing your std dev by sqrt(n) represents this amount?

I think what you need is a good and approachable textbook on statistics, and if you don't want to stare down a lot of dry theory, I'll recommend Larry Gonick's "Cartoon Guide to Statistics", which is a really good introduction to statistics that everybody can read and understand.

I don't have a problem with understanding statistics. I'm curious about how you relate standard deviation of the mean to a representation of white noise and would like you to unpack that a bit if you don't mind.
 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #15 on: December 23, 2018, 07:15:34 pm »
Ok, now I think I understand the OP's point. There is an uncertainty in the mean estimate of the die, which in principle you can reduce to near-zero by a large number of samples. The random process, of the die throw itself, has a variance which doesn't change. But your estimate of the mean keeps improving, in principle. I was confusing the two concepts.

Excellent illustration!
 

Offline Kleinstein

  • Super Contributor
  • ***
  • Posts: 15356
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #16 on: December 23, 2018, 09:14:30 pm »
....
P.S. The estimator for the std. dev. of the mean has a square root of (N-1) and not N. This is because the mean value is also calculated from those numbers.

I've never seen this, and I'm not saying you are wrong but could you point to somewhere that shows this? I've always seen sqrt(N) regardless of what type of standard deviation is calculated.

The difference between the std. deviation of the sample compared to the estimator for the std. deviation of the whole set is explained in Wikipedia:
https://en.wikipedia.org/wiki/Standard_deviation#Corrected_sample_standard_deviation

The unbiased version (at the end) is new to me. This is beyond what one normally learns at school. So the Wiki article seems to be quite advanced and may not be easy to understand.
 

Offline e61_phil

  • Frequent Contributor
  • **
  • Posts: 963
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #17 on: December 23, 2018, 09:43:52 pm »
Are you talking about N-1 in the equation for the std. Deviation or in the std. dev. of the mean?

My understanding so far was, that N-1 is used for std. dev. if your sample doesn't cover the whole population.

And I think if you have only a small sample size compared to the whole population, like in measurements, you have to use the student-t distribution to guess the std. dev.
 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #18 on: December 24, 2018, 12:13:27 am »
Are you talking about N-1 in the equation for the std. Deviation or in the std. dev. of the mean?

My understanding so far was, that N-1 is used for std. dev. if your sample doesn't cover the whole population.

And I think if you have only a small sample size compared to the whole population, like in measurements, you have to use the student-t distribution to guess the std. dev.

Yes, sample standard deviation (n-1 denominator) is used when you don't have access to the entire population. Obviously the population standard deviation (n denominator) is used when you do.

The standard deviation of the mean can be calculated from either, but as far as I know you divide the standard deviation by sqrt(n) regardless of whether you calculated a population or a sample standard deviation. Hence my question to Kleinstein's assertion to use sqrt(n-1) for sample standard deviations.

Regarding student-T, it is my opinion that you should always use student-T to calculate your repeatability/Type A uncertainty regardless of sample size. It covers small and large sample sizes, and the larger the sample size the more it resembles a normal distribution, hence there really is no reason not to always use it.
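A small sketch of how the Student-t coverage factor approaches the normal one as the sample size grows. The t values are standard two-sided 95 % table entries typed in by hand (rounded to three decimals); only the normal quantile is computed, using `statistics.NormalDist` from the Python standard library. The names `T_95` and `Z_95` are made up for this example.

```python
import statistics

# Two-sided 95 % Student-t quantiles, keyed by degrees of freedom (n - 1).
# Standard table values, rounded to three decimals.
T_95 = {2: 4.303, 4: 2.776, 9: 2.262, 29: 2.045, 99: 1.984}

# The corresponding factor for a normal distribution (about 1.96).
Z_95 = statistics.NormalDist().inv_cdf(0.975)

for dof, t in T_95.items():
    n = dof + 1
    print(f"n={n:>3}  t={t:.3f}  ratio to normal={t / Z_95:.3f}")
```

With 3 readings the t factor is more than twice the normal one; with 100 readings the two differ by roughly one percent.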


 

Offline Moon Winx (Topic starter)

  • Regular Contributor
  • *
  • Posts: 83
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #19 on: December 24, 2018, 12:31:50 am »
Here is the output of a program I whipped up to teach myself about std dev of the mean. I simulated 10,000 dice rolls using two dice. The program cut the 10,000 results into sets of equal size, calculated the mean of each set and the standard deviation of these means. The program tested all possible set sizes. See if you can spot a pattern here.

Code:
Set contains 10000 averages of groups of 1
Group averages average: 7.032
Group averages std dev: 2.416
Group averages std of mean: 0.024 (sqrt of n = 100.000)

Set contains 5000 averages of groups of 2
Group averages average: 7.032
Group averages std dev: 1.695
Group averages std of mean: 0.024 (sqrt of n = 70.711)

Set contains 2500 averages of groups of 4
Group averages average: 7.032
Group averages std dev: 1.189
Group averages std of mean: 0.024 (sqrt of n = 50.000)

Set contains 2000 averages of groups of 5
Group averages average: 7.032
Group averages std dev: 1.058
Group averages std of mean: 0.024 (sqrt of n = 44.721)

Set contains 1250 averages of groups of 8
Group averages average: 7.032
Group averages std dev: 0.821
Group averages std of mean: 0.023 (sqrt of n = 35.355)

Set contains 1000 averages of groups of 10
Group averages average: 7.032
Group averages std dev: 0.754
Group averages std of mean: 0.024 (sqrt of n = 31.623)

Set contains 625 averages of groups of 16
Group averages average: 7.032
Group averages std dev: 0.569
Group averages std of mean: 0.023 (sqrt of n = 25.000)

Set contains 500 averages of groups of 20
Group averages average: 7.032
Group averages std dev: 0.527
Group averages std of mean: 0.024 (sqrt of n = 22.361)

Set contains 400 averages of groups of 25
Group averages average: 7.032
Group averages std dev: 0.497
Group averages std of mean: 0.025 (sqrt of n = 20.000)

Set contains 250 averages of groups of 40
Group averages average: 7.032
Group averages std dev: 0.400
Group averages std of mean: 0.025 (sqrt of n = 15.811)

Set contains 200 averages of groups of 50
Group averages average: 7.032
Group averages std dev: 0.364
Group averages std of mean: 0.026 (sqrt of n = 14.142)

Set contains 125 averages of groups of 80
Group averages average: 7.032
Group averages std dev: 0.279
Group averages std of mean: 0.025 (sqrt of n = 11.180)

Set contains 100 averages of groups of 100
Group averages average: 7.032
Group averages std dev: 0.242
Group averages std of mean: 0.024 (sqrt of n = 10.000)

Set contains 80 averages of groups of 125
Group averages average: 7.032
Group averages std dev: 0.229
Group averages std of mean: 0.026 (sqrt of n = 8.944)

Set contains 50 averages of groups of 200
Group averages average: 7.032
Group averages std dev: 0.187
Group averages std of mean: 0.026 (sqrt of n = 7.071)

Set contains 40 averages of groups of 250
Group averages average: 7.032
Group averages std dev: 0.168
Group averages std of mean: 0.027 (sqrt of n = 6.325)

Set contains 25 averages of groups of 400
Group averages average: 7.032
Group averages std dev: 0.118
Group averages std of mean: 0.024 (sqrt of n = 5.000)

Set contains 20 averages of groups of 500
Group averages average: 7.032
Group averages std dev: 0.106
Group averages std of mean: 0.024 (sqrt of n = 4.472)

Set contains 16 averages of groups of 625
Group averages average: 7.032
Group averages std dev: 0.096
Group averages std of mean: 0.024 (sqrt of n = 4.000)

Set contains 10 averages of groups of 1000
Group averages average: 7.032
Group averages std dev: 0.082
Group averages std of mean: 0.026 (sqrt of n = 3.162)

Set contains 8 averages of groups of 1250
Group averages average: 7.032
Group averages std dev: 0.063
Group averages std of mean: 0.022 (sqrt of n = 2.828)

Set contains 5 averages of groups of 2000
Group averages average: 7.032
Group averages std dev: 0.073
Group averages std of mean: 0.033 (sqrt of n = 2.236)

Set contains 4 averages of groups of 2500
Group averages average: 7.032
Group averages std dev: 0.030
Group averages std of mean: 0.015 (sqrt of n = 2.000)


As you can tell, the standard deviation of the mean stays relatively constant regardless of how we manipulate our samples and averaging. Knowing this, we should assume that our sample standard deviation of meter readings is already representing the standard deviation of the mean and should _not_ divide this result by sqrt(n). In other words, we should not be able to manipulate our Type A uncertainties by how we choose the meter averaging/NPLCs combined with how many samples we take from the meter. We should always assume the meter is returning a mean, so our regular std dev calculations of these means already represents the standard deviation of the mean.
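The program itself is not shown in the thread. The following is a hypothetical reconstruction of what is described (two dice, 10,000 rolls, every set size that divides 10,000 and leaves at least 4 sets), written with the Python standard library; `group_stats` and the seed are inventions for this sketch, so the numbers will not match the listing digit for digit.

```python
import math
import random
import statistics

random.seed(4)  # fixed seed so the run is repeatable
N = 10_000
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(N)]

def group_stats(group_size):
    """Return (number of groups, std dev of group means, std dev / sqrt(groups))."""
    means = [
        sum(rolls[i:i + group_size]) / group_size
        for i in range(0, N, group_size)
    ]
    s = statistics.stdev(means)
    return len(means), s, s / math.sqrt(len(means))

# Every group size that divides N evenly and leaves at least 4 groups.
for size in (d for d in range(1, N // 4 + 1) if N % d == 0):
    count, s, s_mean = group_stats(size)
    print(f"{count:>5} groups of {size:>4}: std dev {s:.3f}, std of mean {s_mean:.3f}")
```

In a run like this the std dev column should fall roughly as 2.4/sqrt(group size), while the last column should stay near 2.4/sqrt(10,000), about 0.024. One way to read that constant is as the std dev of the overall mean of all 10,000 rolls, whichever way they are grouped.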
 

Online tomato

  • Regular Contributor
  • *
  • Posts: 207
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #20 on: December 24, 2018, 02:50:43 am »
A way to look at it is to use the Allan deviation to show the correlation over longer times.  Initially the curve usually goes down, which shows that averaging helps, but for very long times the curve tends to flatten or even go up again, as other effects can take over.

The Allan deviation curve would always converge to zero given enough time if you used the standard deviation of the mean to calculate the points, right?

It's not an Allan Variance if you do that.
 

Offline e61_phil

  • Frequent Contributor
  • **
  • Posts: 963
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #21 on: December 24, 2018, 07:25:41 am »
First: I agree that Student's t can be used for every sample size. It converges to a normal distribution. Therefore, one can calculate with the easier normal distribution if the sample size is large enough.

As you can tell, the standard deviation of the mean stays relatively constant regardless of how we manipulate our samples and averaging. Knowing this, we should assume that our sample standard deviation of meter readings is already representing the standard deviation of the mean and should _not_ divide this result by sqrt(n). In other words, we should not be able to manipulate our Type A uncertainties by how we choose the meter averaging/NPLCs combined with how many samples we take from the meter. We should always assume the meter is returning a mean, so our regular std dev calculations of these means already represents the standard deviation of the mean.

I don't get your point here. You showed with your dice example, that std. dev. of smaller samples is higher. So why do you conclude, that it doesn't matter if you use 10 NPLC or 100 NPLC?

The advantage of the std. dev. of the mean is that you can make use of many samples. I did that for the measurement of the 1PPS signal from my GPS receiver. I averaged it over days, and this will reduce the uncertainty of the mean with time (as long as the local oscillator is stable enough, but that is another story). If you simply calculate the std. dev. of the sample, it converges very quickly to a number (that's the idea of the std. dev.). The standard deviation itself doesn't improve your uncertainty.

The same goes for 10 NPLC vs. 100 NPLC: if you measure over the same amount of time (10x 100 NPLC or 100x 10 NPLC), the std. dev. of the mean would get you to the same uncertainty. The std. dev. might differ.

Edit: Perhaps we are talking about different things. If you want to reflect the uncertainty of a single measurement with the used DMM at the DUT, then one should add the std. dev. of the measurement itself to the sum of uncertainties. But that doesn't reflect what is possible with the setup.
« Last Edit: December 24, 2018, 07:46:48 am by e61_phil »
 

Offline Kleinstein

  • Super Contributor
  • ***
  • Posts: 15356
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #22 on: December 24, 2018, 10:41:02 am »
The question of dividing by N or N-1 comes up when one estimates the std. dev. of the single readings. Even though one uses all the readings one has, one should use the N-1 case here, as the estimate is needed to judge possible further readings. Then, when calculating the (estimated) std. deviation of the mean, one gets another division by the square root of N, if the readings are independent. If the readings are not independent, it is up to finding a suitable statistical model and then estimating the type A uncertainty of the mean in a possibly different way.
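The effect of the N versus N-1 denominator can be seen by repeating a small experiment many times and averaging the two variance estimates. This is a sketch under assumed conditions (independent Gaussian readings with true variance 1, five readings per experiment), using `statistics.pvariance` (divides by N) and `statistics.variance` (divides by N-1).

```python
import random
import statistics

random.seed(6)   # fixed seed so the run is repeatable
n = 5            # readings per simulated experiment
trials = 10_000  # number of repeated experiments

biased, corrected = [], []
for _ in range(trials):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1.0
    biased.append(statistics.pvariance(x))    # divides by n
    corrected.append(statistics.variance(x))  # divides by n - 1

print("average variance estimate, n denominator    :", statistics.fmean(biased))
print("average variance estimate, n - 1 denominator:", statistics.fmean(corrected))
```

On average the N-denominator estimate should come out near (N-1)/N = 0.8 of the true variance, while the N-1 version should come out near 1.0. The division by sqrt(N) for the mean is a separate step applied afterwards.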

Given enough time one can usually reduce the type A uncertainty by repetition - this is a main reason why it is calculated separately. However, with only the square root this is a slow process. With correlated readings (e.g. from 1/f type noise) the process tends to be even less efficient, or to take more than just reading a meter several times.

The example with 100x 10 PLC or 10x 100 PLC is a case where at first sight one expects the same noise level. However, depending on the meter internals (e.g. how AZ is handled) there can be slight differences related to the samples not being fully independent. Especially Keithley meters tend to use some hidden filtering on the zero reading, so that the readings are not independent. This can lead to wrong and different estimates of the std. deviation, if ignored. It can also have an effect on the actual noise of the mean.

Anyway, one quite often has enough readings that it does not make a big difference whether one uses N or N-1 or maybe N-1.5. The noise estimate also usually does not have to be that accurate. Nobody really cares if it is 1 ppm noise or 1.05 ppm.

It is still a good idea to check the actual scattering; some problems show up as higher or lower noise than normal.
 

Offline e61_phil

  • Frequent Contributor
  • **
  • Posts: 963
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #23 on: December 24, 2018, 11:09:24 am »
Unfortunately I can't find the data for exactly that example. I did some measurements with an Agilent 34401A connected to 10V from my Fluke 5440B a year ago.

The idea was to investigate exactly this topic. The measurement with 100x 10 NPLC shows more std. dev. than 10x 100 NPLC. The std. dev. of the mean was more or less equal.
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3531
  • Country: us
Re: Standard deviation of the mean - what does this mean?
« Reply #24 on: December 24, 2018, 10:35:41 pm »
I recommend:

Random Data
Bendat & Piersol
4th ed 2010

I've relied on it for 30 years starting with the 2nd ed.
 

Offline e61_phil

  • Frequent Contributor
  • **
  • Posts: 963
  • Country: de
Re: Standard deviation of the mean - what does this mean?
« Reply #25 on: December 25, 2018, 11:04:41 am »
Just for fun I created a number of normally distributed random numbers with a standard deviation of 1 ppm. Each number represents a measurement with 10 NPLC. With these numbers I built a second array with the means of 10-sample chunks from the first array. This should represent the measurement of the same data with 100 NPLC.

From both arrays the std. deviation and the std. dev. of the mean were calculated.

While the standard deviation quickly converges to a number, the std. deviation of the mean converges to zero.
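A sketch of an experiment along these lines, not the code used for the post: it assumes ideal, independent Gaussian noise of 1 ppm per simulated 10 NPLC reading and builds the "100 NPLC" readings as plain averages of 10 consecutive values. The names `fast`, `slow` and `std_and_std_of_mean` are made up here.

```python
import math
import random
import statistics

random.seed(5)   # fixed seed so the run is repeatable
SIGMA_PPM = 1.0  # assumed noise of one simulated 10 NPLC reading, in ppm

# Simulated "10 NPLC" readings (deviation from nominal, in ppm).
fast = [random.gauss(0.0, SIGMA_PPM) for _ in range(10_000)]
# Simulated "100 NPLC" readings: means of 10 consecutive fast readings.
slow = [sum(fast[i:i + 10]) / 10 for i in range(0, len(fast), 10)]

def std_and_std_of_mean(readings):
    """Return (sample std dev, std dev of the mean)."""
    s = statistics.stdev(readings)
    return s, s / math.sqrt(len(readings))

print("10 NPLC :", std_and_std_of_mean(fast))
print("100 NPLC:", std_and_std_of_mean(slow))
```

Under these assumptions the std dev of the slow readings should be about sqrt(10) times smaller than that of the fast ones, while the std dev of the mean should come out nearly the same for both, since the total measurement time is identical.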

Nothing new after this thread..
 

Offline pwlps

  • Frequent Contributor
  • **
  • Posts: 372
  • Country: fr
Re: Standard deviation of the mean - what does this mean?
« Reply #26 on: January 02, 2019, 04:46:42 pm »
Moon Winx is right:

Quote
As you can tell, the standard deviation of the mean stays relatively constant regardless of how we manipulate our samples and averaging. Knowing this, we should assume that our sample standard deviation of meter readings is already representing the standard deviation of the mean and should _not_ divide this result by sqrt(n). In other words, we should not be able to manipulate our Type A uncertainties by how we choose the meter averaging/NPLCs combined with how many samples we take from the meter. We should always assume the meter is returning a mean, so our regular std dev calculations of these means already represents the standard deviation of the mean.

If you take a sum of N independent normal (Gaussian) variables, each with standard deviation S, you get a normal variable with standard deviation S*sqrt(N); therefore an average of N such variables will have a standard deviation S/sqrt(N).  Now if you divide N into N1 batches of N2 samples each (N=N1*N2), then the average of each batch will have a standard deviation S/sqrt(N2), and repeating the process, the average of averages will have a standard deviation (S/sqrt(N2))/sqrt(N1) = S/sqrt(N1*N2).  Therefore the result does not depend on how you divide the population and as you say "our regular std dev calculations of these means already represents the standard deviation of the mean".

That said, we have in principle to be careful here because the sample standard deviation (with the N-1 denominator) is not a distribution parameter: it is only an estimator of the distribution's standard deviation and is itself a random variable (for Gaussian data the sample variance follows a scaled chi-squared distribution).  Actually the standard deviation of a sum of N independent variables also varies like sqrt(N) times the standard deviation of each of them, therefore the formulas used above apply here, and the standard deviation of the mean should not depend either on how you divide the population.
 Note that in real life the variables might not be quite independent (for example if there is a slow oscillation in the measured values), then we have to use the formula (https://en.wikipedia.org/wiki/Standard_deviation):

stdev(X+Y) = sqrt(stdev(X)^2 + stdev(Y)^2 + 2*cov(X,Y))

If the covariance is not zero then the obtained std dev won't give a good approximation of the true standard deviation anymore and will probably depend on the size of the batches.
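The effect of a non-zero covariance can be checked numerically with the identity Var(X+Y) = Var(X) + Var(Y) + 2*Cov(X,Y). The sketch below assumes two simulated readings that share a common disturbance; the `covariance` helper and all noise levels are made up for the example.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable
n = 20_000
common = [random.gauss(0.0, 1.0) for _ in range(n)]  # shared disturbance
x = [c + random.gauss(0.0, 1.0) for c in common]
y = [c + random.gauss(0.0, 1.0) for c in common]

def covariance(a, b):
    """Sample covariance with the n-1 denominator."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

# Std dev of the sum computed directly from the summed data.
direct = statistics.stdev([p + q for p, q in zip(x, y)])
# Same quantity from the individual variances plus twice the covariance.
formula = math.sqrt(statistics.variance(x) + statistics.variance(y)
                    + 2 * covariance(x, y))
# What one gets by wrongly treating x and y as independent.
naive = math.sqrt(statistics.variance(x) + statistics.variance(y))

print(direct, formula, naive)
```

The first two numbers should agree to within rounding, and the third should be noticeably smaller, which is how ignoring correlation underestimates the scatter of a sum (and hence of an average).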
 
The following users thanked this post: Moon Winx

