Electronics > Projects, Designs, and Technical Stuff
Curve fitting question
rs20:

--- Quote from: rhb on March 13, 2019, 11:30:35 pm ---Learning to use the tool is what's important.  It makes it very easy to examine the residual error of the fit.

My preferred practice is to plot the data divided by the approximation.  I started doing that when I was working on the equations of state of reservoir fluids.  It's pretty much the standard way that thermodynamic properties are handled.  It has the great virtue of making errors in your choice of approximation and data outliers obvious.  The cost of making measurements over a wide range of temperatures and pressures makes experimental data scarce in that field.

And it will generate publication quality EPS for inclusion in a document.

--- End quote ---
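The ratio diagnostic quoted above can be sketched in a few lines; the data array `y` and fitted model `y_fit` here are hypothetical stand-ins:

```python
import numpy as np

# Hypothetical measurement: a power-law response with 1% multiplicative noise.
x = np.linspace(1.0, 10.0, 200)
rng = np.random.default_rng(1)
y = 2.0 * x**1.5 * (1.0 + 0.01 * rng.standard_normal(x.size))
y_fit = 2.0 * x**1.5                 # candidate approximation

# rhb's diagnostic: plot data / approximation against x.
ratio = y / y_fit
# A good fit scatters tightly around 1.0; a trend in the ratio reveals a
# wrong functional form, and isolated excursions reveal outliers.
```

Plotting `ratio` versus `x` (e.g. with matplotlib) then makes model errors and outliers visually obvious, since everything is normalized to 1.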

What's your strategy to avoid overfitting (or put another way, what's your strategy for distinguishing noise from signal?)
rhb:
A general answer to that is difficult.  Usually I have had some basic physics to guide me.  I intensely dislike neural nets and such because they hide the actual connection to reality.

At the moment, looking at the spectrum of the residual error to see if it approximates expected noise processes is the best response I can think of.  De Boor talks about that in his book on spline fitting.
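A minimal sketch of that spectral check, assuming the residuals (data minus fitted model) are already in a NumPy array:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 512
residual = rng.standard_normal(n)    # ideally: data - fitted_model

# One-sided power spectrum of the residuals.
spec = np.abs(np.fft.rfft(residual))**2 / n

# White noise gives a roughly flat spectrum; strong peaks or trends
# suggest the fit has missed a systematic component of the signal.
flatness = spec[1:].std() / spec[1:].mean()   # ~1.0 for white noise
```

If the spectrum departs clearly from the expected noise process, that is evidence the chosen functional form is absorbing noise or leaking signal.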

My preference, given a huge amount of data, is to generate a probability density function and fit that.  I'll ponder the matter and see what I think of in the morning.
basinstreetdesign:
Yeah, I expect the jaggedness of the data is due to random errors of some sort.  But when presented with a problem like this I usually turn to a tool that I got for free a while ago.

A US professor had created a methodology not for fitting a data set to a given function, but for finding an optimal function to fit through the data set.  He was the speaker in a fascinating YT vid on the subject.  He called it Formulize and offered it for FREE to all comers, minus a couple of enhancements.  He then changed the name to Eureqa and kept improving it.  That's when I got a copy of it for WIN 7.  In the last 2-3 years his outfit must have been sold to another company that markets it under the name of Nutonian, and they now sell copies of it for cash only - sigh - but a 30-day free trial is still available.

Anyway, I put your 32-point dataset into my copy and this is what it came up with in 10 minutes.  Errors were input as percent figures.  First your data, then the result.  The top right-hand chart compares the error values (red line) against the Vin data with the generated function (blue dots).  The function displayed is the best fit of the several functions it tried, listed to the left.

The function shown is:
0.0923823416426414 + 0.00055657868106123*Vin^2 + 3.64813221897325e-11*Vin^5 - 0.0337946466493136*Vin - 2.31333395165299e-16*Vin^7 - 3.06788272866507e-6*Vin^3 = Error

It has a "R^2 goodness of fit" of 0.8809998, a correlation coefficient of 0.94465699 and Mean Square Error of 0.0058587105

I limited the generated function to constants, arithmetic, trig, power, log and root terms.  I have tried allowing others, including inverse trig, hyperbolic trig, logical and squashing functions, but they produced no better results in 10 minutes.

Even if this is useless it at least allowed me to play with this tool and to kill half an hour.
Cheers
rhb:
What are the physics of the device you are trying to calibrate?  That will tell you what functional forms to consider. 

As an example, if you are trying to calibrate a signal path where you have connectors at a certain distance, then you expect to have some ringing due to reflections between the connectors.  So that requires a functional form which includes that with the reflection coefficients and the delay as the fitting variables.

Trying functions until you find one that fits is *not* a good idea.  Also you really do need a *lot* more data.

The king of the hill is the "Dantzig selector" presented by Emmanuel Candes.  You create a large matrix of possible solutions, A, collect some data, y, and solve Ax=y for a sparse x using an L1 solver such as linear programming.

https://statweb.stanford.edu/~candes/papers/DantzigSelector.pdf

It is far superior to any L2-based method and particularly appropriate where the data are known to be a superposition of several functions.  It handily solves problems that Forman Acton advised his readers not to attempt in "Numerical Methods That Work".  The easiest explanation of the concepts is the section on matching and basis pursuit in Mallat's 3rd ed. of "A Wavelet Tour of Signal Processing".
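The Dantzig selector itself adds a constraint on the residual correlations; the closely related basis-pursuit formulation below shows the L1 machinery in its simplest form (a sketch with a synthetic dictionary A, not a reproduction of the paper's exact method):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_obs, n_atoms = 30, 80
A = rng.standard_normal((n_obs, n_atoms))   # dictionary of candidate functions
x_true = np.zeros(n_atoms)
x_true[[5, 17, 42]] = [1.5, -2.0, 0.7]      # sparse ground truth
y = A @ x_true                              # noiseless observations

# Basis pursuit: minimise ||x||_1 subject to A x = y.
# Split x = u - v with u, v >= 0, then minimise sum(u) + sum(v) as an LP.
c = np.ones(2 * n_atoms)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:n_atoms] - res.x[n_atoms:]
# With enough observations, the L1 solution recovers the sparse x exactly,
# even though the system A x = y is underdetermined (30 equations, 80 unknowns).
```

The point rhb makes survives intact in the sketch: the matrix A encodes the candidate functional forms up front, and the L1 solver picks out the few that actually explain the data.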

I was doing that when I realized I was doing things I'd been taught could not be done.  Which is what started me on a 3 year, 3000 page mathematical odyssey.
tggzzz:

--- Quote from: rhb on March 15, 2019, 12:21:15 pm ---What are the physics of the device you are trying to calibrate?  That will tell you what functional forms to consider. 

As an example, if you are trying to calibrate a signal path where you have connectors at a certain distance, then you expect to have some ringing due to reflections between the connectors.  So that requires a functional form which includes that with the reflection coefficients and the delay as the fitting variables.

Trying functions until you find one that fits is *not* a good idea.  Also you really do need a *lot* more data.

--- End quote ---

That accurate and extremely pertinent point has been made at least twice in this thread. It may not have been understood :(

https://www.eevblog.com/forum/projects/curve-fitting-question/msg2249952/#msg2249952
https://www.eevblog.com/forum/projects/curve-fitting-question/msg2250018/#msg2250018 (as usual, XKCD summarises the issue pithily)