And this is why a proper lesson in statistics is in order, because you are making a lot of mistakes, mister.

You'd be wise to realise that part of the reason people don't follow your messages here is because your use of statistical language is so incredibly vague. Granted, I missed that what you had done was pre-multiply by 3* to give one of the most amateurish "tolerance intervals" I've seen in a while. But because you called it a "range" rather than a "tolerance interval" or anything else, I didn't interpret it as such, and was completely thrown.

* And then attach that to 5, rather than the sample mean? What? You can't honestly blame me for doubting your credentials at that point.

But most of all, you are totally missing the point of what a data sample is and how you can use it statistically to get a very good estimate of all possible outcomes.

Or, in other words, how a 32-unit sample can be representative of a couple of hundred (or even thousands of) multimeters.

Yep, I totally didn't get the hint that you were calculating tolerance intervals and calling them "ranges".

The absolutely WRONG way of doing it is the way you did it, since you don't know whether there will be other meters that are worse than that, or how many of them there will be.

Are you serious? I didn't do anything other than state a number of facts that remain perfectly true: the standard deviation is X, the observed range is Y, and I couldn't understand what you were talking about. All true at the time.

Last point: yes, by DEFINITION things are given in standard deviations.

Why? Because science is a statistical way of measuring things.

I have worked for companies that do calibrations, and their standards work exactly this way.

But like I said before, most engineers are pretty sloppy about using it correctly (or don't use it at all, actually).

No. The datasheet specs might be prepared behind the scenes using statistics similar to the ones you described, but what they mean is absolute limits that are not to be interpreted using assumptions of normal distribution, let alone standard deviations. Calibration of 9-digit super-high-end multimeters, sure, there's going to be statistics involved there -- but I'm seeing no evidence at all that the basic datasheet specs of 121GW-level multimeters are anything other than guaranteed simple limits. Datasheet limits are often much looser than the sample statistics would suggest (take the results for this meter, for example), since future process changes could screw with things (a possibility that is completely beyond the ability of naive 3-sigma statistics to estimate).

Given that it is a sample of meters from a large population, one should compute the sample standard deviation rather than the population standard deviation.

One should then quote the sample mean \$\bar x\$ and the sample standard deviation \$s\$ and allow the reader to do their own interpretation and estimates of confidence intervals as they wish.

True enough, I'd have been happy with either those raw statistics or an explicitly called-out TI.
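To make the "just quote the raw statistics" suggestion concrete, here's a minimal sketch (Python/NumPy) of computing \$\bar x\$ and \$s\$. The readings are made-up stand-ins, since the actual 32-meter data isn't reproduced here:

```python
import numpy as np

# Made-up stand-ins for the 32 meter readings (illustrative values only).
rng = np.random.default_rng(0)
readings = rng.normal(loc=5.0000, scale=0.0004, size=32)

x_bar = readings.mean()      # sample mean, x-bar
s = readings.std(ddof=1)     # sample standard deviation, s (n - 1 denominator)

print(f"sample mean  = {x_bar:.5f}")
print(f"sample stdev = {s:.5f}")
```

The `ddof=1` is the important bit: NumPy's `std` defaults to the divide-by-\$n\$ population formula, so the sample standard deviation has to be asked for explicitly.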

Regarding sample standard deviation, I don't disagree (although I'm not sure the difference is non-negligible even with as few as 30 samples) -- however, there is yet another assumption that has been made here. What is true is that for normally distributed data, 99.7% of population items fall between \$\mu - 3\sigma\$ and \$\mu + 3\sigma\$, where \$\mu\$ is the population mean and \$\sigma\$ is the population standard deviation.
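That 99.7% figure is easy to sanity-check empirically -- a quick sketch using a simulated standard-normal "population" (not meter data):

```python
import numpy as np

rng = np.random.default_rng(1)
pop = rng.normal(0.0, 1.0, 1_000_000)  # simulated population: known mu = 0, sigma = 1

# Fraction of population items inside (mu - 3 sigma, mu + 3 sigma).
inside = np.mean(np.abs(pop) <= 3.0)
print(f"fraction within mu +/- 3 sigma: {inside:.4f}")  # ~0.997
```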

One of the most widespread mistakes in statistics, IMHO, is when people take the sample standard dev and plug it into the formula above. You might be thinking "eh, what does it matter, it'll be a little bit high sometimes and a little bit low sometimes, so some intervals will be a little too generous and others a little too tight, but it'll all cancel out, right?" Answer: no. It turns out you make more mistakes with the erroneously tight intervals than can possibly be made up for by the generous ones. This is where the t-distribution comes in, and it is why you actually need Student's t in many situations that seem naively amenable to nonsense like 'take "the" mean and add/subtract 3 times "the" standard deviation'.
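To put a rough number on "erroneously tight": for \$n = 32\$ samples, the multiplier a t-based interval for the mean demands at the 3-sigma-equivalent level is noticeably bigger than 3. A SciPy sketch (0.99865 is the one-sided tail probability matching +3 sigma on a normal):

```python
from scipy.stats import t

n = 32            # sample size, as in the 32-meter sample
df = n - 1
p = 0.99865       # one-sided probability matching +3 sigma on a normal

k_naive = 3.0         # what "mean +/- 3 * sample stdev" implicitly assumes
k_t = t.ppf(p, df)    # what the t-distribution actually demands

print(f"naive multiplier: {k_naive}")
print(f"t multiplier (df = {df}): {k_t:.2f}")  # larger than 3
```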

And everything above is all about generating a 99.7% confidence interval for the population mean -- we haven't even started generating intervals that contain 99.7% of units in the population (which are called tolerance intervals rather than confidence intervals, by the way -- and that's not just a pedantic difference, the numbers are completely different: CIs shrink with more data, TIs don't).
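One way to see the CI-vs-TI difference numerically. This sketch uses SciPy, with Howe's well-known approximation for the normal tolerance-interval k-factor; \$s = 1\$ just sets the scale, and the sample sizes are hypothetical:

```python
import numpy as np
from scipy.stats import t, norm, chi2

def ci_halfwidth(s, n, conf=0.9973):
    """Half-width of a confidence interval for the POPULATION MEAN."""
    return t.ppf(1 - (1 - conf) / 2, n - 1) * s / np.sqrt(n)

def ti_halfwidth(s, n, coverage=0.9973, conf=0.95):
    """Half-width of a two-sided tolerance interval covering `coverage`
    of the population (Howe's approximation to the k-factor)."""
    z = norm.ppf(1 - (1 - coverage) / 2)
    df = n - 1
    k = np.sqrt(df * (1 + 1 / n) * z**2 / chi2.ppf(1 - conf, df))
    return k * s

# With s = 1 for scale: the CI shrinks toward zero as n grows; the TI doesn't.
for n in (32, 320, 3200):
    print(f"n={n:5d}  CI +/-{ci_halfwidth(1.0, n):.3f}  TI +/-{ti_halfwidth(1.0, n):.3f}")
```

The CI half-width falls off roughly as \$1/\sqrt{n}\$, while the TI half-width only settles toward about \$3s\$ no matter how many units you sample.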

Anyway, my point to all this is that whether the sample standard deviation should be used is a rather long way down the list of things that need to be sorted out here. (Edit: but to be clear, you're absolutely correct that the sample standard deviation is much more correct in most/all circumstances)