Author Topic: Curve fitting question  (Read 6392 times)


Offline AlcidePiR2

  • Regular Contributor
  • *
  • Posts: 95
  • Country: fr
Re: Curve fitting question
« Reply #25 on: March 08, 2019, 08:00:18 am »
@Jester: You need to do at least one additional experiment covering the full curve again. Even better, a few.

That will tell you the noise level in your experiment.
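As an illustration only (the numbers below are invented, not Jester's data), repeated sweeps let you estimate the noise as the spread at each test point:

Code: [Select]
# Minimal sketch: estimate per-point noise by repeating the full sweep a few
# times and looking at the spread at each test point. Numbers are made up.
import numpy as np

# rows = repeated runs, columns = the same set of test points
runs = np.array([
    [0.12, 0.31, 0.55, 0.74],   # run 1 (error readings)
    [0.10, 0.29, 0.58, 0.71],   # run 2
    [0.13, 0.33, 0.53, 0.76],   # run 3
])

mean_curve = runs.mean(axis=0)          # best estimate of the true curve
noise_std  = runs.std(axis=0, ddof=1)   # per-point noise level (sample std dev)

print("mean:", mean_curve)
print("std :", noise_std)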
 
The following users thanked this post: Jester

Offline Berni

  • Super Contributor
  • ***
  • Posts: 5050
  • Country: si
Re: Curve fitting question
« Reply #26 on: March 08, 2019, 10:27:39 am »
Curve fitting onto noisy data will also try to reproduce the noise as best it can; it has no way of knowing what is noise and what is signal. Usually, limiting the number of terms in the polynomial smooths things out somewhat, because it limits the fit's ability to reproduce fine detail.

If you want the same smoothing effect with a lookup table, you just run the whole thing through a filter that smooths out the points. Basically like blurring an image, but in 1D instead of 2D. The end result is the same, just a different process to get the smoothing effect.
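To make that analogy concrete, here is a minimal sketch of the 1D "blur" on a lookup table; the table values and filter length are made up for illustration:

Code: [Select]
# A rough sketch of the "blur in 1D" idea: smooth a noisy lookup table with a
# simple 3-point moving-average filter. Table values are invented.
import numpy as np

lut = np.array([0.0, 0.8, 1.1, 2.5, 2.2, 3.9, 4.1, 5.6])  # noisy correction table

kernel = np.ones(3) / 3.0                      # 3-point box filter
smoothed = np.convolve(lut, kernel, mode="same")

# mode="same" keeps the table length; the end points see fewer neighbours,
# so in practice you may want to pin or re-measure the first and last entries.
print(smoothed)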

Not saying polynomial curve fitting is bad, just that it's much better suited to very smooth graphs. If the graph follows only a 2nd-order curve over its whole span and you have 100 points, then yes, a polynomial curve fit is an excellent way to clean it up: a low-order polynomial will average out the noise, since it can't take sharp turns, and it is computationally nice for lookup because it is short. But the graph the OP is showing is not a smooth graph at all, and I assume he wants his curve fit to look similar to the interpolated curve that Excel draws (and Excel likely did that with cubic interpolation or similar for efficiency reasons).

But if we are talking about the actual data, the graph in the original post does look like it should have more points in it. Some points show significant jumps, so you probably want some extra points in there. As for noise, it's the OP's job to make sure his test setup produces a sufficiently accurate graph to begin with. No amount of math will magically remove noise, not even polynomials; you can only mask it a bit by smoothing. The proper way to 'remove noise' is to take many passes at the measurement and then average the results. That is provided your DUT doesn't drift between the test runs, but if it does drift, then correcting it using calibration data won't work anyway.
« Last Edit: March 08, 2019, 10:29:10 am by Berni »
 
The following users thanked this post: Jester

Offline JesterTopic starter

  • Frequent Contributor
  • **
  • Posts: 898
  • Country: ca
Re: Curve fitting question
« Reply #27 on: March 09, 2019, 02:56:07 pm »
This is such a great forum from the sharing of expertise perspective. Big thanks to all of you.

Noise was certainly a factor, as well as the relative error. I broke the curve up into a few segments, then remeasured the outliers and found that on average they were actually closer to the trend. The function finder in ZunZun is great in that it lets you see the error for a multitude of solutions; polynomial solutions were usually not great.

Fortunately I'm using a 32-bit uC, so some number crunching is not much of an impediment. I'm getting great corrected results now for 0-250 V, AC and DC, and will be moving on to current measurement later today. I'm using a LEM DCCT, and I'm anticipating a fair bit of drift; hopefully I can correct for it based on the temperature in the box.
 
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Curve fitting question
« Reply #28 on: March 11, 2019, 07:09:29 am »
Did some linear regression for the error term, and the results look suspiciously like an ADC non-linearity error function. Complete with excursions within 1 LSB and some semi-random walking of the residue. That residue will contain a good bit of noise, but I would not be at all surprised if there's some zig-zag pattern in there at multiples of 16 or 32 or something similar.

Below is the result when using the data points from 10 to 200, i.e. excluding the big jump at 210. Not so much for numerical reasons, but more to make it a bit easier to spot any potential periodicity in the residue.


And here the full range is used, so including the fairly big jump at 210. The curve fit is still fairly similar to the previous one thanks to the use of l1-minimization, which is far less sensitive to outliers than ye olde l2-norm least squares.
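For anyone who wants to try the same comparison, here is a small sketch of the l1-vs-l2 contrast described above. The data, slope and outlier are all invented, and the l1 fit is done by simply minimising the sum of absolute residuals rather than by whatever solver was actually used here:

Code: [Select]
# Sketch only: compare an ordinary (l2) least-squares line fit with an
# l1 (least absolute deviations) fit, which is far less pulled around by a
# single outlier such as the jump at 210. All data here are fabricated.
import numpy as np
from scipy.optimize import minimize

x = np.arange(10, 211, 10, dtype=float)                 # hypothetical test voltages
y = 0.01 * x + 0.5 + np.random.normal(0, 0.05, x.size)  # fake linear error + noise
y[-1] += 2.0                                            # one big outlier at 210

# l2 fit: ordinary least squares
a2, b2 = np.polyfit(x, y, 1)

# l1 fit: minimise the sum of absolute residuals (Nelder-Mead copes well
# enough with the non-smooth objective for just two parameters)
def l1_cost(p):
    a, b = p
    return np.abs(y - (a * x + b)).sum()

a1, b1 = minimize(l1_cost, x0=[a2, b2], method="Nelder-Mead").x

print("l2 slope/intercept:", a2, b2)
print("l1 slope/intercept:", a1, b1)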
« Last Edit: March 11, 2019, 07:24:51 am by mrflibble »
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Curve fitting question
« Reply #29 on: March 11, 2019, 07:49:19 am »
Quote
Noise was certainly a factor, as well as the relative error. I broke the curve up into a few segments, then remeasured the outliers and found that on average they were actually closer to the trend. The function finder in ZunZun is great in that it lets you see the error for a multitude of solutions; polynomial solutions were usually not great.
If you are serious about spending more time and effort on making a predictor for the error, then you might want to consider getting more data. Those few samples are okay for some quick checks, but IMO not nearly enough for a result with decent statistical significance. Also, if you have a maximum budget of measurement points due to time constraints or whatever, you are better off generating them at random intervals, as opposed to 0, 10, 20, 30, etc. For the same number of samples, non-uniform intervals will typically give you more information for this kind of thing. And as a bonus, by using random instead of fixed intervals, certain systematic errors are less likely to occur. At no extra cost to the user. :)
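A trivial sketch of what that could look like (the span and point count are just placeholders):

Code: [Select]
# Draw the test voltages at random within the span instead of on a fixed
# 0, 10, 20, ... grid. Range and point count are invented placeholders.
import numpy as np

rng = np.random.default_rng(seed=1)   # fixed seed only so the example repeats
n_points = 25
v_min, v_max = 0.0, 250.0

test_voltages = np.sort(rng.uniform(v_min, v_max, n_points))
print(np.round(test_voltages, 1))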
 

Online golden_labels

  • Super Contributor
  • ***
  • Posts: 1472
  • Country: pl
Re: Curve fitting question
« Reply #30 on: March 11, 2019, 12:00:48 pm »
Nice observation, mrflibble. I would not be surprised if the step at 200 V is the effect of n-and-a-half-digit DMMs having one of their voltage ranges ending at 199.9…V. Which brings us to a question I already wanted to ask earlier: what about the measurement errors? Perhaps what was achieved in that topic is producing a calibration curve for the multimeter itself?
People imagine AI as T1000. What we got so far is glorified T9.
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3516
  • Country: us
Re: Curve fitting question
« Reply #31 on: March 11, 2019, 01:19:51 pm »
Try the curve fitting in gnuplot. The Levenberg-Marquardt solver in it is fantastic. I used it for all my professional work as a research scientist working for major and super-major oil companies. It also gives you publication-quality figures.

I was able to do accurate fits to y = a*exp(b*x + c*x**2 + d*x**3) and higher-order polynomial exponents. It takes a bit of practice, but it's easier than you'd think. The key is to solve for only one term at a time until you've got a fit for all of them, then do fits for more than one.

It will blow up if you happen to choose the wrong combination of terms, but usually you can do a final solve for all the coefficients once you've gotten close enough. The higher the order, the closer you need to be. Gnuplot also has lots of other curve-fitting algorithms available.
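The post describes doing this with gnuplot's fit command; as a rough illustration of the same staged idea in Python (synthetic data, invented coefficients), one could do something like:

Code: [Select]
# Sketch of the staged approach: fit the low-order terms first, then use them
# as the starting point for the full model. Data and coefficients are made up;
# curve_fit uses Levenberg-Marquardt for unbounded problems.
import numpy as np
from scipy.optimize import curve_fit

x = np.linspace(0.1, 2.0, 50)
y = 1.3 * np.exp(0.8 * x + 0.2 * x**2 - 0.05 * x**3)   # synthetic "measurements"

def model(x, a, b, c, d):
    return a * np.exp(b * x + c * x**2 + d * x**3)

# Stage 1: fit only a and b, with c and d frozen at zero.
(a0, b0), _ = curve_fit(lambda x, a, b: model(x, a, b, 0.0, 0.0), x, y, p0=[1.0, 1.0])

# Stage 2: refit everything, seeded with the stage-1 result.
popt, pcov = curve_fit(model, x, y, p0=[a0, b0, 0.0, 0.0])
print(popt)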
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Curve fitting question
« Reply #32 on: March 13, 2019, 02:43:36 am »
Quote
Nice observation, mrflibble. I would not be surprised if the step at 200 V is the effect of n-and-a-half-digit DMMs having one of their voltage ranges ending at 199.9…V.
Agreed. That step at 200 V would suggest a range switch. If you check the graph, you can see that the zig-zag pattern is continuous ... except that at 200 V there is a different offset. So the step you see is the difference in offset between the two ranges used. Or at least, that would be my guess. If this is indeed the case, it might be a good idea to redo the measurements with the DMM set to a fixed range. That, and it would be interesting to see what influence temperature has on the result.

Quote
Which brings us to a question I already wanted to ask earlier: what about the measurement errors? Perhaps what was achieved in that topic is producing a calibration curve for the multimeter itself?
Yeah, incorporating the measurement uncertainty of the DMM into the calibration would make sense. That way you'd get some idea of how much confidence you can have in the end result of your calibration procedure. Mainly it depends on how much effort the OP would want to put into this.
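One simple way to fold that in (a sketch with an invented DMM spec, not the OP's actual numbers) is to give the fit a per-point uncertainty:

Code: [Select]
# Weight the fit by the meter's stated uncertainty so the covariance of the
# fitted coefficients reflects it. The DMM spec and readings are invented.
import numpy as np
from scipy.optimize import curve_fit

v_true = np.arange(10.0, 211.0, 10.0)      # hypothetical reference values
v_meas = v_true * 1.002 + 0.05             # fake readings with gain/offset error

sigma = 0.001 * v_meas + 0.05              # invented spec: 0.1 % of reading + 0.05 V

def line(v, gain, offset):
    return gain * v + offset

popt, pcov = curve_fit(line, v_true, v_meas, sigma=sigma, absolute_sigma=True)
perr = np.sqrt(np.diag(pcov))              # 1-sigma uncertainty on gain and offset
print("gain, offset:", popt, "+/-", perr)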
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Curve fitting question
« Reply #33 on: March 13, 2019, 03:30:12 am »
Quote
Try the curve fitting in gnuplot. The Levenberg-Marquardt solver in it is fantastic. I used it for all my professional work as a research scientist working for major and super-major oil companies.
Don't you think damped least squares is a tiny bit overkill for this? ;D The data would suggest either linear or piecewise linear, depending on the required fit. If you check the plots a few posts back, you can see that the residue plot has a distinct sawtooth shape to it, which to me would suggest using a piecewise linear function as the curve fit.
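One possible reading of that (a sketch only, with invented data and assumed breakpoint positions): fix the knot locations, e.g. at suspected range or segment boundaries, and least-squares fit the knot values.

Code: [Select]
# Piecewise-linear fit sketch: knot x-positions are assumed/fixed, the knot
# y-values are the free parameters. Data and breakpoints are invented.
import numpy as np
from scipy.optimize import least_squares

x = np.arange(10.0, 211.0, 10.0)
y = 0.02 * x + 0.3 * ((x // 50) % 2) + np.random.normal(0, 0.02, x.size)  # fake error curve

knots = np.array([10.0, 60.0, 110.0, 160.0, 210.0])   # assumed segment boundaries

def residuals(knot_y):
    return np.interp(x, knots, knot_y) - y            # piecewise-linear model

fit = least_squares(residuals, x0=np.interp(knots, x, y))
print(dict(zip(knots, fit.x)))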

Another reason for piecewise linear is that I'd guess the sawtooth shape is due to the hypothetical layout of the hypothetical ADC. To be fair, with so few data points that is a reasonably wild guess. At this point we're playing the "Guess my calibration procedure!" game, so it could go either way. :-//
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3516
  • Country: us
Re: Curve fitting question
« Reply #34 on: March 13, 2019, 11:30:35 pm »
Learning to use the tool is what's important.  It makes it very easy to examine the residual error of the fit.

My preferred practice is to plot the data divided by the approximation. I started doing that when I was working on the equations of state of reservoir fluids; it's pretty much the standard way thermodynamic properties are handled. It has the great virtue of making both errors in your choice of approximation and data outliers obvious. The cost of making measurements over a wide range of temperatures and pressures makes experimental data scarce in that field.
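As a minimal sketch of that ratio-plot idea (with fabricated data and a fabricated model, just to show the shape of it):

Code: [Select]
# Divide the data by the fitted model and plot that: a correct model scatters
# around 1.0, and outliers or a wrong functional form stand out immediately.
# Data and model here are fabricated.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(1.0, 10.0, 40)
y = 2.0 * x**1.5 * (1 + np.random.normal(0, 0.01, x.size))   # fake measurements
fit = 2.0 * x**1.5                                            # model evaluated at x

plt.plot(x, y / fit, "o")
plt.axhline(1.0, color="k", linewidth=0.5)
plt.ylabel("data / approximation")
plt.show()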

And it will generate publication quality EPS for inclusion in a document.
 

Offline rs20

  • Super Contributor
  • ***
  • Posts: 2322
  • Country: au
Re: Curve fitting question
« Reply #35 on: March 15, 2019, 12:48:18 am »
Quote
Learning to use the tool is what's important.  It makes it very easy to examine the residual error of the fit.

My preferred practice is to plot the data divided by the approximation. I started doing that when I was working on the equations of state of reservoir fluids; it's pretty much the standard way thermodynamic properties are handled. It has the great virtue of making both errors in your choice of approximation and data outliers obvious. The cost of making measurements over a wide range of temperatures and pressures makes experimental data scarce in that field.

And it will generate publication quality EPS for inclusion in a document.

What's your strategy for avoiding overfitting? Or, put another way, what's your strategy for distinguishing noise from signal?
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3516
  • Country: us
Re: Curve fitting question
« Reply #36 on: March 15, 2019, 01:14:38 am »
A general answer to that is difficult.  Usually I have had some basic physics to guide me.  I intensely dislike neural nets and such because they hide the actual connection to reality.

At the moment, looking at the spectrum of the residual error to see if it approximates expected noise processes is the best response I can think of. De Boor talks about that in his book on spline fitting.
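In practice that check can be as simple as the following sketch (the residuals here are just synthetic white noise standing in for real fit residuals):

Code: [Select]
# Look at the spectrum of the fit residuals. White noise gives a roughly flat
# spectrum; a strong peak would point to periodic structure (e.g. an ADC code
# pattern) the model has not captured. Residuals here are synthetic.
import numpy as np

residuals = np.random.normal(0, 1, 256)        # stand-in for real fit residuals

spectrum = np.abs(np.fft.rfft(residuals - residuals.mean()))**2
freqs = np.fft.rfftfreq(residuals.size)        # cycles per sample spacing

peak = np.argmax(spectrum[1:]) + 1             # largest non-DC component
print("strongest periodicity at", freqs[peak], "cycles/sample")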

My preference is to have a huge amount of data, generate a probability density function, and fit that. I'll ponder the matter and see what I think of in the morning.
 

Offline basinstreetdesign

  • Frequent Contributor
  • **
  • Posts: 466
  • Country: ca
Re: Curve fitting question
« Reply #37 on: March 15, 2019, 05:29:54 am »
Yeah, I expect the jaggedness of the data is due to random errors of some sort.  But when presented with a problem like this I usually turn to a tool that I got for free a while ago.

A US professor had created a methodology, not for fitting a data set to a given function, but for finding an optimal function to fit through the data set. He was the speaker in a fascinating YT vid on the subject. He called it Formulize and offered it for FREE to all comers, minus a couple of enhancements. He then changed the name to Eureqa and kept improving it; that's when I got a copy of it for WIN 7. In the last 2-3 years his outfit must have been sold to another company that markets it under the name of Nutonian, and they now sell copies of it for cash only - sigh - but a 30-day free trial is still available.

Anyway, I put your 32-point dataset into my copy and this is what it came up with in 10 minutes. Errors were input as percent figures. First your data,
Then the result. The top right-hand chart compares the error values (red line), plotted against Vin, with the generated function (blue dots). The function displayed is the best fit of the several functions it tried, listed to the left.

The function shown is:
Error = 0.0923823416426414 - 0.0337946466493136*Vin + 0.00055657868106123*Vin^2 - 3.06788272866507e-6*Vin^3 + 3.64813221897325e-11*Vin^5 - 2.31333395165299e-16*Vin^7

It has a "R^2 goodness of fit" of 0.8809998, a correlation coefficient of 0.94465699 and Mean Square Error of 0.0058587105

I limited the generated function to constants, arithmetic, trig, powers, log and root terms. I have tried allowing others, including inverse trig, hyperbolic trig, logical and squashing functions, but they produced no better results in 10 minutes.

Even if this is useless, it at least allowed me to play with this tool and kill half an hour.
Cheers
« Last Edit: March 15, 2019, 05:56:46 am by basinstreetdesign »
STAND BACK!  I'm going to try SCIENCE!
 

Offline rhb

  • Super Contributor
  • ***
  • Posts: 3516
  • Country: us
Re: Curve fitting question
« Reply #38 on: March 15, 2019, 12:21:15 pm »
What are the physics of the device you are trying to calibrate?  That will tell you what functional forms to consider. 

As an example, if you are trying to calibrate a signal path where you have connectors at a certain distance, then you expect to have some ringing due to reflections between the connectors.  So that requires a functional form which includes that with the reflection coefficients and the delay as the fitting variables.

Trying functions until you find one that fits is *not* a good idea.  Also you really do need a *lot* more data.

The king of the hill is the "Dantzig selector" presented by Emmanuel Candes.  You create a large matrix of possible solutions, A, collect some data, y, and solve Ax=y for a sparse x using an L1 solver such as linear programming.

https://statweb.stanford.edu/~candes/papers/DantzigSelector.pdf

It is far superior to any L2-based method, and particularly appropriate where the data are known to be a superposition of several functions. It handily solves problems that Forman Acton advised his readers not to attempt in "Numerical Methods That Work". The easiest explanation of the concepts is the section on matching pursuit and basis pursuit in the 3rd edition of Mallat's "A Wavelet Tour of Signal Processing".
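As a flavour of the "solve Ax = y for a sparse x with an L1 solver" step, here is a basis-pursuit sketch via plain linear programming (not the Dantzig selector itself, which adds a noise-aware constraint; see the linked Candes paper). The dictionary and sparse coefficients are randomly generated for illustration:

Code: [Select]
# Basis pursuit: min ||x||_1 subject to A x = y, written as a linear program
# with x = u - v and u, v >= 0. A, x_true and y are synthetic.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 20, 50                            # more candidate functions than data points
A = rng.normal(size=(m, n))              # dictionary of candidate basis functions
x_true = np.zeros(n)
x_true[[3, 17, 41]] = [1.5, -2.0, 0.7]   # the data really use only 3 of them
y = A @ x_true

c = np.ones(2 * n)                       # objective: sum(u) + sum(v) = ||x||_1
A_eq = np.hstack([A, -A])                # A u - A v = y
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]

print(np.flatnonzero(np.abs(x_hat) > 1e-6))   # should recover indices 3, 17, 41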

I was doing that when I realized I was doing things I'd been taught could not be done.  Which is what started me on a 3 year, 3000 page mathematical odyssey.
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 21225
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Curve fitting question
« Reply #39 on: March 15, 2019, 02:54:19 pm »
Quote
What are the physics of the device you are trying to calibrate?  That will tell you what functional forms to consider.

As an example, if you are trying to calibrate a signal path where you have connectors at a certain distance, then you expect to have some ringing due to reflections between the connectors.  So that requires a functional form which includes that with the reflection coefficients and the delay as the fitting variables.

Trying functions until you find one that fits is *not* a good idea.  Also you really do need a *lot* more data.

That accurate and extremely pertinent point has been made at least twice in this thread. It may not have been understood :(

https://www.eevblog.com/forum/projects/curve-fitting-question/msg2249952/#msg2249952
https://www.eevblog.com/forum/projects/curve-fitting-question/msg2250018/#msg2250018 (as usual, XKCD summarises the issue pithily)
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

