EEVblog Electronics Community Forum
Electronics => Beginners => Topic started by: jboard146 on February 29, 2024, 06:53:44 pm
-
I'm trying to wrap my head around the math behind calibration tables, but don't know where to start. Google and ChatGPT aren't really a help, probably due to my search/prompt.
For example, say I characterized a temperature sensor (it could be any sort of measured value). So I have a list of input and output values, but they are not linear. My real-world example is measuring RF power, which is a whole other rabbit hole, so let's stick to a temperature sensor.
I'm assuming that when you have a calibration table you are trying to fit the points to a straight line and not some polynomial function, then code this into some sort of microcontroller etc.
I may be totally off here, but I think that is what I want to do.
I'm looking for a good reference book, video, website etc. and/or what this is called in mathematics. A generic code example would be great as well.
-
It would be helpful if you posted an example of a calibration table and gave more detail about what you want to calibrate. The basic mathematical approach to finding a function that approximates tabular data is called curve fitting. The choice of function to fit can be made by plotting the data and deciding what function to use based on the apparent trend. Even better is if you know the underlying mechanism the data (i.e., the table values) represent; the mechanism may suggest a particular function. A straight line is often not satisfactory.
Mike
-
One common mathematical function to use in curve fitting is a polynomial, such as
y(x) = a0 + a1*x + a2*x^2 + a3*x^3
where the a's are the calibration constants.
This can work well to interpolate between the measured calibration values, but such a polynomial will always "turn around" outside the range used for the measured data, so it should not be used to extrapolate past that range.
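For illustration, such a polynomial is typically evaluated on the microcontroller with Horner's scheme. A minimal C sketch (the a0..a3 values are placeholders for whatever your fit produces):

/* Hypothetical calibration coefficients a0..a3 from a cubic fit. */
static const float a0 = 0.0f, a1 = 1.0f, a2 = 0.0f, a3 = 0.0f;

/* y(x) = a0 + a1*x + a2*x^2 + a3*x^3, evaluated with Horner's scheme:
 * three multiplications and three additions, no pow() needed. */
static float calibrate(float x)
{
    return a0 + x * (a1 + x * (a2 + x * a3));
}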
-
I'm assuming that when you have a calibration table you are trying to fit the points to a straight line and not some polynomial function, then code this into some sort of microcontroller etc.
Linear would be nice, for sure. Over a sufficiently small range any smooth function can be treated as linear.
But if you have to deal with a wide range of conditions and the physics aren't linear -- then you either accept inaccuracy, or you use a more complex function.
If you're lucky then quadratic or cubic might be close enough. Piecewise cubic lets you match the slope as well as the value at both ends of each segment. But piecewise linear might be good enough.
You might want to approximate the inverse function rather than trying to solve the forward function.
Even a 16 MHz 8-bit AVR can do a lot of maths for each data point if it only needs to do it 10 or 100 times a second. But these days the term "microcontroller" can include a 300 MHz 32-bit chip with hardware floating point.
-
You can learn about splines too.
https://en.m.wikipedia.org/wiki/Spline_(mathematics)
-
I use wxMaxima (http://wxmaxima-developers.github.io/wxmaxima/) or Maxima and gnuplot (http://www.gnuplot.info/) for most of my curve fitting needs. They are free and open source, and available for all operating systems. wxMaxima is the graphical user interface for Maxima, a computer algebra system. Gnuplot is a tool for generating various kinds of plots, but can also do least-squares one-dimensional curve fitting (using nonlinear least-squares Marquardt-Levenberg algorithm) quite well.
If you have N values –– that is, N pairs of x and f(x) ––, then an N-1'th degree polynomial gives an exact fit.
For example, let's say you have a table of four points:
x f(x)
25 100
50 120
75 145
100 185
This tells us a third degree polynomial can describe this exactly. In Maxima, we can find it out using
f(x) := C0 + C1*x + C2*x^2 + C3*x^3;
solve([ f(25) = 100, f(50) = 120, f(75) = 145, f(100) = 185 ], [ C0, C1, C2, C3 ]);
Note that there are four coefficients C0 through C3. This is plain linear algebra, and by hand the solution is obtained by solving the system of equations, often in matrix form. However, Maxima will tell us the solution coefficients:
[[ C0 = 75, C1 = 37/30, C2 = -3/250, C3 = 1/9375 ]]
If you want them in decimal form, use for example float(%); which reports the results as
[[ C0 = 75.0, C1 = 1.23333333333, C2 = -0.012, C3 = 1.066666667e-4 ]]
Thus, the function we are looking for is exactly
f(x) := 75 + 37*x/30 - 3*x*x/250 + x*x*x/9375;
To test, if you try f(25); f(50); f(75); f(100);, you will get the expected results, 100, 120, 145, and 185, respectively. f(20) will yield 2393/25, or (after float(%);) 95.72.
You can use standard functions in the function definition, in which case it becomes a system of N equations with N unknowns, which may or may not have a solution, or may have more than one solution. If Maxima responds with just [], it means it couldn't find an exact solution. It may also return several solutions, with %1, %2 as free variables (i.e., any value for them yields a valid solution). For example, if you ask
solve(x^2 - 4 = 0, x);
it will tell you the two solutions,
[x = -2, x = 2]
If I had the dosh, I might buy a license for Maple (https://www.maplesoft.com/products/Maple/), because its optimizer and solver are slightly better, and I often play with annoyingly complex functions where I need to do transforms myself to get Maxima to find a solution. Back in my Uni days, I loved to use Maple for that stuff (molecular dynamics potential models, forcefields, et cetera).
When you have many more data points to fit to, an exact function is often not realistic. Instead, you want an approximate function that reproduces the data points as closely as possible. The trick is how to define "close" when you have a set of distances, and there are different methods for measuring the distance. A typical definition of the distance is the residual, which is just the difference between the fitted function value and the data set value at each data point x. (That means the distance is only measured along the f(x) axis, or vertically in typical plots.) A typical definition of "as close as possible" is least squares, which finds the fit that yields the smallest sum of squared residuals over all the given data points. This is what is called "least squares fitting".
Let's say we wanted to fit f(x) = C0 + C1*x + C2*x*x to the above dataset using least squares fitting. It cannot exactly reproduce the data points, but it should get close. With gnuplot, we first prepare the data set, by creating a plain text file with one data point per line. (Lines that begin with a # are comment lines, and gnuplot will ignore those.) For example, let's say we create data.txt with
# x f(x)
25 100
50 120
75 145
100 185
The first line is just a comment line for us humans. We then start gnuplot, and tell it:
f(x) = C0 + C1*x + C2*x*x
fit f(x) "data.txt" using 1:2 via C0, C1, C2
The 1:2 there means the first column holds the values of x, and the second column the values of f(x). It iterates for a while, then describes the fit it generated. The fit parameters are not on the last lines; they are a bit earlier, under the "Final set of parameters" heading. The fit also sets the fit constants, so just run
print C0, C1, C2
and it will tell you
92.4999999999093 0.120000000003356 0.00799999999997438
i.e. the fit function is f(x) = 92.5 + 0.12*x + 0.008*x*x.
With gnuplot, we can now compare our fit to the points by plotting it as a graph:
set xrange [ 0 : 125 ]; plot f(x) notitle with lines lc "red", "data.txt" using 1:2 notitle with points lc "black"
Similarly, we can print the fit function values in the four interesting points via
print f(25), f(50), f(75), f(100)
which tells us the corresponding values (100.5, 118.5, 146.5, 184.5, so not exactly right, but not that much off either).
We could also just try a linear fit, via
f(x) = C0 + C1*x ; fit f(x) "data.txt" u 1:2 via C0, C1
which is not too bad either.
If you want more control over fitting, say adding importance/weighting to the data points you have (and thus going beyond basic "least squares" fitting with your own "distance" and "close enough" definitions), you can use Octave (https://octave.org/)/Matlab (Octave being pretty compatible with Matlab but free and open source), or if you are familiar with Python, use numpy (https://numpy.org/)/scipy (https://scipy.org/) for numerical fitting (and other stuff, including Fourier transforms and such). Statistics people tend to use the R programming language (https://en.wikipedia.org/wiki/R_(programming_language)), which is also free and open source, where fitting curves slots under the heading regression analysis, often linear regression (a typical example is seeing a point cloud, and the linear regression line through it, showing the dependence between the point x and y coordinates).
For microcontroller stuff, the above is typically the starting point and not the final word, because they (especially Maxima) assume infinite-precision variables. So, I personally often end up just finding the approximate values for the coefficients C0, C1, and so on, and write a brute-force (possibly aided by gradient descent at highest supported precision and range, typically long double on my machine) C program to find the exact values in the actual format I will use –– be that fixed point or float/IEEE 754 binary32 or double/IEEE 754 binary64 ––, that optimizes the approximate coefficients for the used numeric format. This is often seen as silly or going too far, but writing the program is quick, and I can usually leave the computer to chug through the solution space overnight, so I don't see the silliness. You might wish to check out the 4th order polynomial coefficients for pressure at temperature (https://www.eevblog.com/forum/projects/4th-order-polynomial-coefficients-for-pressure-at-temperature-readings-help!!!/msg3194850/#msg3194850) thread from a few years back for what that kind of optimization can be useful for, compared to just curve fitting.
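As a rough sketch of that last idea (a minimal greedy hill climb over single-ULP steps, not the exhaustive overnight search described above; the helper names are made up, while the data points and starting coefficients are the ones from the gnuplot example earlier in this post):

#include <stdio.h>
#include <math.h>

#define N 4
static const double table_x[N] = {  25,  50,  75, 100 };
static const double table_y[N] = { 100, 120, 145, 185 };

/* Worst-case absolute error of y = c0 + c1*x + c2*x^2 over the table,
 * with the coefficients constrained to the float format used at run time. */
static double max_err(const float c[3])
{
    double worst = 0.0;
    for (int i = 0; i < N; i++) {
        double y = (double)c[0] + table_x[i] * ((double)c[1] + table_x[i] * (double)c[2]);
        double e = fabs(y - table_y[i]);
        if (e > worst)
            worst = e;
    }
    return worst;
}

int main(void)
{
    /* Starting point: the gnuplot least-squares fit from above. */
    float c[3] = { 92.5f, 0.12f, 0.008f };
    double best = max_err(c);
    int improved = 1;

    /* Nudge each coefficient by one float ULP in either direction and keep
     * any change that lowers the worst-case error; repeat until no single
     * step helps any more. */
    while (improved) {
        improved = 0;
        for (int k = 0; k < 3; k++) {
            for (int dir = -1; dir <= 1; dir += 2) {
                float trial[3] = { c[0], c[1], c[2] };
                trial[k] = nextafterf(trial[k], dir > 0 ? INFINITY : -INFINITY);
                double e = max_err(trial);
                if (e < best) {
                    best = e;
                    c[k] = trial[k];
                    improved = 1;
                }
            }
        }
    }
    printf("refined: %.9g %.9g %.9g  (max error %.9g)\n", c[0], c[1], c[2], best);
    return 0;
}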
-
With Chebyshev coefficients you can fit a function optimally, minimizing the maximum absolute error.
https://en.m.wikipedia.org/wiki/Chebyshev_nodes
https://en.m.wikipedia.org/wiki/Chebyshev_polynomials
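As a minimal illustration of the first link: the Chebyshev nodes on an interval [a, b] are just cosine-spaced sample points, and you would sample or calibrate at these x values instead of equally spaced ones to keep the worst-case error of a polynomial fit small. The interval and node count below are arbitrary examples:

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double pi = 3.14159265358979323846;
    const double a = 0.0, b = 100.0;   /* example calibration interval */
    const int n = 8;                   /* example number of nodes */

    /* Chebyshev nodes mapped onto [a, b]:
     * x_k = (a+b)/2 + (b-a)/2 * cos((2k+1)*pi/(2n)), k = 0..n-1.
     * They cluster towards the ends of the interval. */
    for (int k = 0; k < n; k++) {
        double x = 0.5 * (a + b) + 0.5 * (b - a) * cos((2 * k + 1) * pi / (2 * n));
        printf("node %d: x = %.4f\n", k, x);
    }
    return 0;
}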
-
Yes, polynomials are one approach. Their utility may depend on the math functions your MCU has.
If Maxim is a good role model, it uses a look-up table (LUT) in its MAX31856 and similar thermocouple chips. AMS also uses a LUT in its lightning detector ("Franklin") chip. I'm just an aged hobbyist and am not embarrassed to use a LUT to convert BCD mm to binary inches. It's hard to beat a LUT for speed, and in my experience, if not overdone, they often take up less code space in the end.
-
Thank you!
I at least know what I need to start reading about, which was the intent of my post.
-
Linear look-up and piecewise-defined functions via look-up tables can be both extremely fast and produce really good results for y=f(x).
The approach I like most is one where you select the x values so that they are densest where f(x) changes behaviour, with the pairs sorted by x. Then, a binary search can find the piece containing the x you want f(x) for very, very quickly, and you need fewer points than with regular x spacing.
Linear interpolation is the easiest. If you have \$x_0 < x < x_1\$, and \$y_0 = f(x_0)\$, \$y_1 = f(x_1)\$, then
$$y = f(x) = \frac{(x_1 - x) y_0 + (x - x_0) y_1}{x_1 - x_0}$$
This is numerically stable, and even with limited-precision numbers, will approach \$y_0\$ when \$x\$ approaches \$x_0\$, and \$y_1\$ when \$x\$ approaches \$x_1\$. When using integer \$x\$ and \$y\$, that is the formula you'll want to use, even though it does involve a division.
You do not want to use \$f(x) = y_0 + (x - x_0) C\$ or \$f(x) = A + x B\$, where precomputed \$C = (y_1 - y_0) / (x_1 - x_0)\$ or \$A = (x_1 y_0 - x_0 y_1)/(x_1 - x_0)\$ and \$B = (y_1 - y_0) / (x_1 - x_0)\$, because these are not numerically stable. Finite precision will cause a discontinuous jump when \$x\$ reaches \$x_1\$. It is much better to use
$$y = f(x) = (x_1 - x) a + (x - x_0) b, \quad a = \frac{y_0}{x_1 - x_0}, \quad b = \frac{y_1}{x_1 - x_0}$$
saving the value of \$a\$ and \$b\$ for each \$x\$. Note that you don't then need to save the values of \$y\$ at all, so it is just two constants for each value of \$x\$.
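A minimal C sketch of that per-segment form (the segment search is omitted; a and b are the precomputed constants described above):

/* One piecewise-linear segment with precomputed constants:
 * a = y0 / (x1 - x0), b = y1 / (x1 - x0). Then y = (x1 - x)*a + (x - x0)*b,
 * which hits y0 exactly at x0 and y1 exactly at x1 even in finite precision. */
struct lin_segment {
    float x0, x1;   /* segment endpoints */
    float a, b;     /* precomputed constants */
};

static float lin_segment_eval(const struct lin_segment *s, float x)
{
    return (s->x1 - x) * s->a + (x - s->x0) * s->b;
}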
For cubic numerically stable interpolation, use
$$y = f(x) = (x - x_0)^3 C_0 + (x - x_0)^2 (x_1 - x) C_1 + (x - x_0) (x_1 - x)^2 C_2 + (x_1 - x)^3 C_3$$
where
$$C_0 = \frac{y_1}{(x_1 - x_0)^3}, \quad C_3 = \frac{y_0}{(x_1 - x_0)^3}, \quad C_1 = 3 C_0 - \frac{Y_1}{(x_1 - x_0)^2}, \quad C_2 = 3 C_3 + \frac{Y_0}{(x_1 - x_0)^2}$$
and \$Y_0\$ is the slope of \$f(x)\$ at \$x = x_0\$, and \$Y_1\$ is the slope of \$f(x)\$ at \$x = x_1\$. Here, for each \$x\$, you save \$C_0\$, \$C_1\$, \$C_2\$, and \$C_3\$. That is, if you have \$f(x)\$, and \$df(x)/dx\$ for a set of \$x\$, you can use cubic interpolation this way in a numerically stable fashion, saving just four constants \$C_0\$, \$C_1\$, \$C_2\$, and \$C_3\$ for each interval in \$x\$.
In your code, you look up the four constants (via a binary search, finding the largest \$x_0\$ smaller than your \$x\$, the next one being \$x_1\$), then calculate \$a_1 = x - x_0\$, \$a_2 = a_1^2\$, \$b_1 = x_1 - x\$, \$b_2 = b_1^2\$, and finally \$y = f(x) = a_1 a_2 C_0 + a_1 b_1 (a_1 C_1 + b_1 C_2) + C_3 b_1 b_2\$, for a total of 10 multiplications and 5 additions or subtractions.
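A C sketch of building and evaluating one such cubic segment straight from the formulas above (the struct and function names are only illustrative):

/* One cubic segment, stored as the four constants described above. */
struct cubic_segment {
    float x0, x1;
    float C0, C1, C2, C3;
};

/* Build the constants from endpoint values y0, y1 and slopes Y0, Y1:
 * C0 = y1/h^3, C3 = y0/h^3, C1 = 3*C0 - Y1/h^2, C2 = 3*C3 + Y0/h^2,
 * where h = x1 - x0. */
static void cubic_segment_init(struct cubic_segment *s,
                               float x0, float y0, float Y0,
                               float x1, float y1, float Y1)
{
    const float h  = x1 - x0;
    const float h2 = h * h;
    const float h3 = h2 * h;
    s->x0 = x0;
    s->x1 = x1;
    s->C0 = y1 / h3;
    s->C3 = y0 / h3;
    s->C1 = 3.0f * s->C0 - Y1 / h2;
    s->C2 = 3.0f * s->C3 + Y0 / h2;
}

/* y = a1^3*C0 + a1^2*b1*C1 + a1*b1^2*C2 + b1^3*C3, grouped so that it
 * costs 10 multiplications and 5 additions/subtractions. */
static float cubic_segment_eval(const struct cubic_segment *s, float x)
{
    const float a1 = x - s->x0;
    const float b1 = s->x1 - x;
    const float a2 = a1 * a1;
    const float b2 = b1 * b1;
    return a1 * a2 * s->C0 + a1 * b1 * (a1 * s->C1 + b1 * s->C2) + s->C3 * b1 * b2;
}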
-
I'm trying to wrap my head around the math behind calibration tables, but don't know where to start. Google and ChatGPT aren't really a help, probably due to my search/prompt.
For example, say I characterized a temperature sensor (it could be any sort of measured value). So I have a list of input and output values, but they are not linear. My real-world example is measuring RF power, which is a whole other rabbit hole, so let's stick to a temperature sensor.
I'm assuming that when you have a calibration table you are trying to fit the points to a straight line and not some polynomial function, then code this into some sort of microcontroller etc.
I may be totally off here, but I think that is what I want to do.
I'm looking for a good reference book, video, website etc. and/or what this is called in mathematics. A generic code example would be great as well.
It's called approximation by interpolation, or just interpolation.
You have a function which maps input to output (sensor temperature to sensor resistance, say). You know the function at certain values (measurement points).
You want to estimate the function at values in between the measurement points (you want to "interpolate" the points).
If you want to pursue the mathematical side, this takes you into fields called Approximation Theory, Optimal Recovery, and Functional Analysis.
I suggest you don't start there. It gets deep quickly.
One way to get started: pick up an introductory book on Numerical Analysis, and look for a chapter on Interpolation. Or look on-line for similar
material e.g. https://en.wikipedia.org/wiki/Interpolation
Another way to get started: look for discussions of sensor linearization using look-up tables e.g.
https://electronics.stackexchange.com/questions/412702/linearizing-a-0-10v-non-linear-sensor
-
I'm trying to wrap my head around the math behind calibration tables, but don't know where to start. Google and ChatGPT aren't really a help, probably due to my search/prompt.
For example, say I characterized a temperature sensor (it could be any sort of measured value). So I have a list of input and output values, but they are not linear. My real-world example is measuring RF power, which is a whole other rabbit hole, so let's stick to a temperature sensor.
I'm assuming that when you have a calibration table you are trying to fit the points to a straight line and not some polynomial function, then code this into some sort of microcontroller etc.
I may be totally off here, but I think that is what I want to do.
I'm looking for a good reference book, video, website etc. and/or what this is called in mathematics. A generic code example would be great as well.
Hi there,
As you know by now, there are a ton of methods for doing this. Some are more stable than others. For example, a straight-up polynomial curve fit does not always work out very well because there is no way to control the overshoots and undershoots in the areas between data points. To get this to work better, you can introduce derivatives into the mix and get more control over how the fit behaves between data points.
The more general method is the Least Squares method, which has been mentioned in this thread already. A more advanced version would be the Levenberg-Marquardt algorithm, but there are really a lot of choices, and there are some tricks you can use if you take a little more time to study the data.
Since this will be done with a microcontroller and you probably don't want it to take too long to compute each result, a common method is to just use linear interpolation between points in a table in which you store your measured and verified data. I'll give a brief introduction.
Let's say you have inputs at x=1, 2, and 3, and corresponding outputs of 10, 20, and 30. This would probably be a perfectly linear set but it doesn't have to be, so we will for now assume it is not really linear. We will still use linear interpolation though as an example.
The thing with that data set is that we do not know what the output is for an input of say 1.2, 2.5, etc., we just know those three points.
To solve for the result at x=2.5: the difference between x=3 and x=2 is 1, and 2.5 is 0.5 of the way along that interval, so 0.5 is our scaling factor. Since x=2 and x=3 correspond to outputs of 20 and 30, we know the result lies between those two values. The difference between them is 10, and 10 times the scaling factor 0.5 is 5. Add that to 20 (the output at the lower point) and we get 25. That is linear interpolation.
Now to get the solution for x=1.2 we do the same thing: 2 minus 1 is 1, and 1.2 is 0.2 of the way along, so the scaling factor is now 0.2. The output lies between 10 and 20, the difference is 10, 10 times 0.2 is 2, and 10+2 equals 12, so the output is 12. It's that simple and fast, which is why it is used with slower microcontrollers.
You just have to make sure you have enough data points so that you do not get large errors using this simple method.
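The same arithmetic as a tiny C function, with the two worked examples above as checks (just a sketch):

#include <stdio.h>

/* Linear interpolation between the two surrounding table points,
 * using the scaling-factor approach described above. */
static double lerp(double x, double x0, double y0, double x1, double y1)
{
    double frac = (x - x0) / (x1 - x0);   /* scaling factor, 0..1 */
    return y0 + frac * (y1 - y0);
}

int main(void)
{
    printf("%g\n", lerp(2.5, 2.0, 20.0, 3.0, 30.0));  /* prints 25 */
    printf("%g\n", lerp(1.2, 1.0, 10.0, 2.0, 20.0));  /* prints 12 */
    return 0;
}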
There are some tricks you can use though. With the example of the thermistor, if you take the natural log of each output first, you can get a better fit. That would mean storing values that are the natural log of the true values. When you go to interpolate, you would raise the constant 'e' to the power of the table value to get the resistance back. If you use a fourth-order polynomial combined with this method, you can get an accuracy of 0.1 percent over the entire temperature range of a 1 percent tolerance thermistor when compared to measured data.
You can also look up the formula for a thermistor and incorporate that.
The most common method though is to use a table and use linear interpolation. If you do not have room to store a table though, you would have to resort to the Least Squares method, or the other one mentioned above. The tradeoff is the time spent calculating the results.
There are also very intense algorithms out there for curve fitting that look at deviations and trends and other stuff in order to determine certain things about the fit and also about the formula being used for the fit, such as identifying outliers and redundant variables. This gets very interesting.
-
Wouldn't a simple sin(x)/x interpolation like the oscilloscopes do be simple and easy? If it's good enough for oscilloscopes it should be good enough for interpolating calibration points, no?
-
Wouldn't a simple sin(x)/x interpolation like the oscilloscopes do be simple and easy?
They exploit uniform sampling intervals; they don't actually calculate each point separately, but convert a sequence of regularly spaced samples \$x_i\$ to a much denser sequence of regularly spaced samples \$y_i\$. It is pretty slow to calculate for a single arbitrary value, compared to the piecewise methods.
Cubic interpolation produces surprisingly good results, even for audio use.
Bézier curves, used in CAD and SVG and PostScript and PDF and basically everywhere, are cubic functions in each coordinate (with a curve parameter sweeping from 0 to 1).
-
Bezier curves are parametric, which isn't the best fit for this purpose since we simply want to map x to y or y to x, rather than deal with x(t) and y(t). Polynomial and rational approximations are commonly used.
-
Bezier curves are parametric, which isn't the best fit for this purpose since we simply want to map x to y or y to x, rather than deal with x(t) and y(t).
A one-dimensional Bézier curve \$x(t) = (1-t)^3 X_0 + 3 (1-t)^2 t X_1 + 3 (1-t) t^2 X_2 + t^3 X_3\$ is strictly equivalent to a cubic polynomial \$y(t) = Y_0 + Y_1 t + Y_2 t^2 + Y_3 t^3\$ via \$Y_0 = X_0\$, \$Y_1 = 3 (X_1 - X_0)\$, \$Y_2 = 3 X_2 - 6 X_1 + 3 X_0\$, and \$Y_3 = X_3 - 3 X_2 + 3 X_1 - X_0\$, except that the Bézier form (with or without the \$3\$ constant factors for the middle terms) is numerically stable. Compare to my post above: that is the form to use when you wish to interpolate any function for which you have some points \$x_k\$, the values at those points \$y_k = f(x_k)\$, and the slope (or derivative) at those points, \$Y_k = \left. d f(x) / d x \right|_{x = x_k}\$.
2D motion control via Bézier curves is very interesting, because the curves are split where the derivative changes sign (so that each leftover cubic curve consists only of segments \$t = 0 \dots 1\$ where the derivatives of each coordinate do not change sign), and perfect path is described by a sequence of bits where 0 indicates change in one axis, and 1 in the other. It is then a matter of velocity control how fast this binary sequence is executed. Funky, eh? Only silly people do it by subdividing it into linear segments.
Point is, for univariate interpolation –– one dimension only, i.e. \$y = f(x)\$ –– you normally only use piecewise linear, piecewise cubic, or polynomials (or special dedicated functions) covering the entire range.
-
Many people have already said helpful things about the different approaches: fitting polynomials, piecewise linear interpolation, splines etc. So all I would like to add is this: if you happen to know why and how the sensor response is non-linear, the best method is to fit the specific function that is rooted in the physical model instead of fitting a generic smoother. E.g. if you know the response is logarithmic, fit a log function, not a polynomial. But if you don't, go with splines/LOESS/polynomials/whatever.
-
There is a problem with this approach. The curve may be approximately logarithmic, but not an exact logarithm. In that case (which is most of the time) it is again necessary to use a polynomial.
Note that logarithms are also calculated with polynomials, usually by doing the transformation:
t=(x-1)/(x+1)
And then applying a polynomial:
log_10(x) = a1·t + a3·t^3 + a5·t^5 + a7·t^7 + a9·t^9
a1=0.868591718
a3=0.289335524
a5=0.177522071
a7=0.094376476
a9=0.191337714
for x in the interval [1/sqrt(10), sqrt(10)]
(From Abramowitz and Stegun)
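For reference, that series translates directly into C (same coefficients as quoted above; it is only valid for x between 1/sqrt(10) and sqrt(10), so a complete log10() first has to reduce its argument into that interval, for example by pulling out the exponent as the code in the next reply does):

/* log10(x) ~= a1*t + a3*t^3 + a5*t^5 + a7*t^7 + a9*t^9, t = (x-1)/(x+1),
 * for x in [1/sqrt(10), sqrt(10)]  (coefficients from Abramowitz & Stegun). */
static double log10_approx(double x)
{
    const double a1 = 0.868591718;
    const double a3 = 0.289335524;
    const double a5 = 0.177522071;
    const double a7 = 0.094376476;
    const double a9 = 0.191337714;
    const double t  = (x - 1.0) / (x + 1.0);
    const double t2 = t * t;
    /* Horner's scheme in t^2, then one final multiplication by t. */
    return t * (a1 + t2 * (a3 + t2 * (a5 + t2 * (a7 + t2 * a9))));
}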
-
Look at this code for the logarithm, which uses a ratio of two polynomials:
/*
 * log returns the natural logarithm of its floating
 * point argument.
 * The coefficients are #2705 from Hart & Cheney. (19.38D)
 * It calls frexp.
 */
#include <errno.h>
#include "cmath.h"

static const double p0 = -0.240139179559210510e2;
static const double p1 = 0.309572928215376501e2;
static const double p2 = -0.963769093368686593e1;
static const double p3 = 0.421087371217979714e0;
static const double q0 = -0.120069589779605255e2;
static const double q1 = 0.194809660700889731e2;
static const double q2 = -0.891110902798312337e1;

double log(double arg) {
    double x, z, zz, temp;
    int exp;

    if (arg <= 0.0) {
        errno = EDOM;
        return(-HUGE);
    }
    x = frexp(arg, &exp);
    if (x < INV_SQRT2) {
        x *= 2;
        exp--;
    }
    z = (x-1)/(x+1);
    zz = z*z;
    temp = ((p3*zz + p2)*zz + p1)*zz + p0;
    temp = temp/(((1.0*zz + q2)*zz + q1)*zz + q0);
    temp = temp*z + exp*LN2;
    return(temp);
}

double log10(double arg) {
    return(log(arg)*INV_LN10);
}
Only 7 coefficients and 18 decimal digits of precision!!!
-
Rational polynomials are very accurate with few coefficients, but the coefficients are very difficult to compute.
Non-rational polynomials calculated with Chebyshev nodes are much simpler to compute.
For the precision normally required in practical applications (3 to 5 decimal places) no more than 3 to 5 coefficients are usually necessary.
Example of the function sin(x) approximated by a polynomial with 4 digits of precision:
/* p(x) approximates sin(x) for 0 <= x <= pi/2, to about 4 digits */
float p(float x)
{
    float xx = x * x;
    float s = 0.00761f;
    s = s * xx - 0.16605f;
    s = s * xx + 1.0f;
    return s * x;
}
x        sin(x)   p(x)     relative error
0.00000  0.00000  0.00000   0.00000
0.06283  0.06279  0.06279   0.00000
0.12566  0.12533  0.12533   0.00001
0.18850  0.18738  0.18739   0.00002
0.25133  0.24869  0.24870   0.00004
0.31416  0.30902  0.30903   0.00005
0.37699  0.36812  0.36815   0.00008
0.43982  0.42578  0.42582   0.00010
0.50265  0.48175  0.48181   0.00012
0.56549  0.53583  0.53590   0.00014
0.62832  0.58779  0.58788   0.00015
0.69115  0.63742  0.63753   0.00016
0.75398  0.68455  0.68466   0.00017
0.81681  0.72897  0.72909   0.00017
0.87965  0.77051  0.77063   0.00015
0.94248  0.80902  0.80912   0.00013
1.00531  0.84433  0.84441   0.00010
1.06814  0.87631  0.87636   0.00006
1.13097  0.90483  0.90484   0.00002
1.19381  0.92978  0.92974  -0.00003
1.25664  0.95106  0.95097  -0.00009
1.31947  0.96858  0.96846  -0.00013
1.38230  0.98229  0.98213  -0.00016
1.44513  0.99211  0.99195  -0.00016
1.50796  0.99803  0.99791  -0.00012
1.57080  1.00000  1.00000   0.00000
-
The easiest and most practical approach is quadratic interpolation with a polynomial. It requires 3 calibration points, which give you 3 calibration coefficients. If you need to calibrate a more complicated nonlinearity, you can add more calibration points and use quadratic interpolation between each group of 3 points.
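A minimal C sketch of that idea, using the Lagrange form of the parabola through three calibration points so that no explicit coefficient solving is needed (the point names are placeholders):

/* Quadratic interpolation through three calibration points, Lagrange form:
 * y(x) = y0*L0(x) + y1*L1(x) + y2*L2(x), where each basis polynomial L is
 * 1 at its own point and 0 at the other two. */
static double quad_interp(double x,
                          double x0, double y0,
                          double x1, double y1,
                          double x2, double y2)
{
    double L0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2));
    double L1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2));
    double L2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1));
    return y0 * L0 + y1 * L1 + y2 * L2;
}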
-
Practical viewpoint: memory is nowadays cheap. Even the cheapest microcontrollers come with tens of kilobytes of flash memory which can be used to store calibration data.
So quite often a simple LUT for every possible value works out. For example, for 14-bit ADC input values producing 16-bit calibrated outputs, a 2^14 * 2 = 32768-byte table is needed, which fits in modern-day microcontrollers starting from the $1 class. As microcontrollers access memory at close to CPU processing speed anyway, random access to the table is efficient. Advantages are simplicity of implementation, performance, and perfect behavior with weird nonlinearities which might not be easily modeled with curve fitting.
If such a large LUT nearly fits but not quite, then adding piecewise linear interpolation to reduce the amount of data by maybe 2x or 4x is a good idea, but the further you decimate the worse it gets, and then it's probably a better idea to use a polynomial fit, or for weird nonlinearities, a custom ad hoc function which models the system using a few parameters.
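A sketch of that full-LUT case in C, assuming a 14-bit ADC and 16-bit calibrated outputs (the table contents would be generated from your calibration run; whether const data actually lands in flash depends on the toolchain):

#include <stdint.h>

/* One calibrated 16-bit output for every possible 14-bit ADC code:
 * 2^14 entries * 2 bytes = 32768 bytes of const data,
 * filled in offline from the calibration run. */
extern const uint16_t calibration_lut[1u << 14];

static inline uint16_t calibrated_value(uint16_t adc_code)
{
    return calibration_lut[adc_code & 0x3FFFu];   /* mask to 14 bits */
}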
-
Try this; it will build the polynomial based on your input data.
It covers most interpolation methods: Lagrange, Newton, Neville...
https://www.dcode.fr/function-equation-finder (https://www.dcode.fr/function-equation-finder)
-
A practical example of the need for calibration is measuring temperature with thermocouples.
From NIST:
ITS-90 Table for type T thermocouple
°C 0 1 2 3 4 5 6 7 8 9
Thermoelectric Voltage in mV
0 0.000 0.039 0.078 0.117 0.156 0.195 0.234 0.273 0.312 0.352
10 0.391 0.431 0.470 0.510 0.549 0.589 0.629 0.669 0.709 0.749
20 0.790 0.830 0.870 0.911 0.951 0.992 1.033 1.074 1.114 1.155
30 1.196 1.238 1.279 1.320 1.362 1.403 1.445 1.486 1.528 1.570
40 1.612 1.654 1.696 1.738 1.780 1.823 1.865 1.908 1.950 1.993
50 2.036 2.079 2.122 2.165 2.208 2.251 2.294 2.338 2.381 2.425
60 2.468 2.512 2.556 2.600 2.643 2.687 2.732 2.776 2.820 2.864
70 2.909 2.953 2.998 3.043 3.087 3.132 3.177 3.222 3.267 3.312
80 3.358 3.403 3.448 3.494 3.539 3.585 3.631 3.677 3.722 3.768
90 3.814 3.860 3.907 3.953 3.999 4.046 4.092 4.138 4.185 4.232
100 4.279 4.325 4.372 4.419 4.466 4.513 4.561 4.608 4.655 4.702
110 4.750 4.798 4.845 4.893 4.941 4.988 5.036 5.084 5.132 5.180
120 5.228 5.277 5.325 5.373 5.422 5.470 5.519 5.567 5.616 5.665
130 5.714 5.763 5.812 5.861 5.910 5.959 6.008 6.057 6.107 6.156
140 6.206 6.255 6.305 6.355 6.404 6.454 6.504 6.554 6.604 6.654
150 6.704 6.754 6.805 6.855 6.905 6.956 7.006 7.057 7.107 7.158
160 7.209 7.260 7.310 7.361 7.412 7.463 7.515 7.566 7.617 7.668
170 7.720 7.771 7.823 7.874 7.926 7.977 8.029 8.081 8.133 8.185
180 8.237 8.289 8.341 8.393 8.445 8.497 8.550 8.602 8.654 8.707
190 8.759 8.812 8.865 8.917 8.970 9.023 9.076 9.129 9.182 9.235
200 9.288 9.341 9.395 9.448 9.501 9.555 9.608 9.662 9.715 9.769
210 9.822 9.876 9.930 9.984 10.038 10.092 10.146 10.200 10.254 10.308
220 10.362 10.417 10.471 10.525 10.580 10.634 10.689 10.743 10.798 10.853
230 10.907 10.962 11.017 11.072 11.127 11.182 11.237 11.292 11.347 11.403
240 11.458 11.513 11.569 11.624 11.680 11.735 11.791 11.846 11.902 11.958
250 12.013 12.069 12.125 12.181 12.237 12.293 12.349 12.405 12.461 12.518
260 12.574 12.630 12.687 12.743 12.799 12.856 12.912 12.969 13.026 13.082
270 13.139 13.196 13.253 13.310 13.366 13.423 13.480 13.537 13.595 13.652
280 13.709 13.766 13.823 13.881 13.938 13.995 14.053 14.110 14.168 14.226
290 14.283 14.341 14.399 14.456 14.514 14.572 14.630 14.688 14.746 14.804
300 14.862 14.920 14.978 15.036 15.095 15.153 15.211 15.270 15.328 15.386
310 15.445 15.503 15.562 15.621 15.679 15.738 15.797 15.856 15.914 15.973
320 16.032 16.091 16.150 16.209 16.268 16.327 16.387 16.446 16.505 16.564
330 16.624 16.683 16.742 16.802 16.861 16.921 16.980 17.040 17.100 17.159
340 17.219 17.279 17.339 17.399 17.458 17.518 17.578 17.638 17.698 17.759
350 17.819 17.879 17.939 17.999 18.060 18.120 18.180 18.241 18.301 18.362
360 18.422 18.483 18.543 18.604 18.665 18.725 18.786 18.847 18.908 18.969
370 19.030 19.091 19.152 19.213 19.274 19.335 19.396 19.457 19.518 19.579
380 19.641 19.702 19.763 19.825 19.886 19.947 20.009 20.070 20.132 20.193
390 20.255 20.317 20.378 20.440 20.502 20.563 20.625 20.687 20.748 20.810
A polynomial with these coefficients (highest degree first, as returned by np.polyfit):
[-1.07895005e-14 2.95331121e-11 -4.34050298e-08 4.77945039e-05 3.83895381e-02 2.24050514e-03]
fits the table.
Done with this Python program:
import numpy as np
voltages = [
0.000, 0.039, 0.078, 0.117, 0.156, 0.195, 0.234, 0.273, 0.312, 0.352,
0.391, 0.431, 0.470, 0.510, 0.549, 0.589, 0.629, 0.669, 0.709, 0.749,
0.790, 0.830, 0.870, 0.911, 0.951, 0.992, 1.033, 1.074, 1.114, 1.155,
1.196, 1.238, 1.279, 1.320, 1.362, 1.403, 1.445, 1.486, 1.528, 1.570,
1.612, 1.654, 1.696, 1.738, 1.780, 1.823, 1.865, 1.908, 1.950, 1.993,
2.036, 2.079, 2.122, 2.165, 2.208, 2.251, 2.294, 2.338, 2.381, 2.425,
2.468, 2.512, 2.556, 2.600, 2.643, 2.687, 2.732, 2.776, 2.820, 2.864,
2.909, 2.953, 2.998, 3.043, 3.087, 3.132, 3.177, 3.222, 3.267, 3.312,
3.358, 3.403, 3.448, 3.494, 3.539, 3.585, 3.631, 3.677, 3.722, 3.768,
3.814, 3.860, 3.907, 3.953, 3.999, 4.046, 4.092, 4.138, 4.185, 4.232,
4.279, 4.325, 4.372, 4.419, 4.466, 4.513, 4.561, 4.608, 4.655, 4.702,
4.750, 4.798, 4.845, 4.893, 4.941, 4.988, 5.036, 5.084, 5.132, 5.180,
5.228, 5.277, 5.325, 5.373, 5.422, 5.470, 5.519, 5.567, 5.616, 5.665,
5.714, 5.763, 5.812, 5.861, 5.910, 5.959, 6.008, 6.057, 6.107, 6.156,
6.206, 6.255, 6.305, 6.355, 6.404, 6.454, 6.504, 6.554, 6.604, 6.654,
6.704, 6.754, 6.805, 6.855, 6.905, 6.956, 7.006, 7.057, 7.107, 7.158,
7.209, 7.260, 7.310, 7.361, 7.412, 7.463, 7.515, 7.566, 7.617, 7.668,
7.720, 7.771, 7.823, 7.874, 7.926, 7.977, 8.029, 8.081, 8.133, 8.185,
8.237, 8.289, 8.341, 8.393, 8.445, 8.497, 8.550, 8.602, 8.654, 8.707,
8.759, 8.812, 8.865, 8.917, 8.970, 9.023, 9.076, 9.129, 9.182, 9.235,
9.288, 9.341, 9.395, 9.448, 9.501, 9.555, 9.608, 9.662, 9.715, 9.769,
9.822, 9.876, 9.930, 9.984, 10.038, 10.092, 10.146, 10.200, 10.254, 10.308,
10.362, 10.417, 10.471, 10.525, 10.580, 10.634, 10.689, 10.743, 10.798, 10.853,
10.907, 10.962, 11.017, 11.072, 11.127, 11.182, 11.237, 11.292, 11.347, 11.403,
11.458, 11.513, 11.569, 11.624, 11.680, 11.735, 11.791, 11.846, 11.902, 11.958,
12.013, 12.069, 12.125, 12.181, 12.237, 12.293, 12.349, 12.405, 12.461, 12.518,
12.574, 12.630, 12.687, 12.743, 12.799, 12.856, 12.912, 12.969, 13.026, 13.082,
13.139, 13.196, 13.253, 13.310, 13.366, 13.423, 13.480, 13.537, 13.595, 13.652,
13.709, 13.766, 13.823, 13.881, 13.938, 13.995, 14.053, 14.110, 14.168, 14.226,
14.283, 14.341, 14.399, 14.456, 14.514, 14.572, 14.630, 14.688, 14.746, 14.804,
14.862, 14.920, 14.978, 15.036, 15.095, 15.153, 15.211, 15.270, 15.328, 15.386,
15.445, 15.503, 15.562, 15.621, 15.679, 15.738, 15.797, 15.856, 15.914, 15.973,
16.032, 16.091, 16.150, 16.209, 16.268, 16.327, 16.387, 16.446, 16.505, 16.564,
16.624, 16.683, 16.742, 16.802, 16.861, 16.921, 16.980, 17.040, 17.100, 17.159,
17.219, 17.279, 17.339, 17.399, 17.458, 17.518, 17.578, 17.638, 17.698, 17.759,
17.819, 17.879, 17.939, 17.999, 18.060, 18.120, 18.180, 18.241, 18.301, 18.362,
18.422, 18.483, 18.543, 18.604, 18.665, 18.725, 18.786, 18.847, 18.908, 18.969,
19.030, 19.091, 19.152, 19.213, 19.274, 19.335, 19.396, 19.457, 19.518, 19.579,
19.641, 19.702, 19.763, 19.825, 19.886, 19.947, 20.009, 20.070, 20.132, 20.193,
20.255, 20.317, 20.378, 20.440, 20.502, 20.563, 20.625, 20.687, 20.748, 20.810
]
x = np.array(list(range(0, 400)))
y = np.array(voltages[0:400])
print(x, y)
coef= np.polyfit(x, y, 5)
print(coef)
p = np.poly1d(coef)
for t in range(1, 400):
    print((p(t) - voltages[t]) * 1000)
References:
https://srdata.nist.gov/its90/main/ (https://srdata.nist.gov/its90/main/)
https://www.analog.com/en/resources/analog-dialogue/articles/measuring-temp-using-thermocouples.html (https://www.analog.com/en/resources/analog-dialogue/articles/measuring-temp-using-thermocouples.html)
-
Here's a VERY basic linear lookup example, just in case that is useful to anyone overwhelmed by the complex formulas above.
Lookup table example (input, output)
1 = 10
20 = 500
30 = 1000
When your code has an input value and wants to calculate the output value, it loops through the table, checking each line until the next line's input value is larger than or equal to the value it's looking for. So, if the input value is 25, this check becomes true when it checks line 2, because on the 3rd line we have 30, which is larger than 25.
So now we have located the two lookup table lines that are on each side of the value 25 that we are looking up.
eg
20 = 500
30 = 1000
Now we just need to do some linear interpolation to figure out what output value 25 will be based on the two values that we do have (20 and 30)
(Looking at the data we know this is going to be 750, because 25 is halfway between 20 and 30 and 750 is halfway between 500 and 1000.)
The math to calculate it is pretty simple by finding a percentage for the position on the input side and applying that to the output side.
30-20 = 10
25-20 = 5
5/10 = 0.5 (50%)
and on the output side
1000-500 = 500
0.5 * 500 = 250
250+500 = 750
Interpolation done.
Now this example is VERY basic and has some issues, but if you find the other examples in the thread too technical this one is easy to understand
and you can expand on it now that you can see the basics of how it works.
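For reference, here is the loop-and-interpolate logic described above as a small C sketch, using the same three-line table:

#include <stdio.h>

/* Lookup table from the example above: (input, output) pairs sorted by input. */
static const double table_in[]  = {  1.0,  20.0,   30.0 };
static const double table_out[] = { 10.0, 500.0, 1000.0 };
#define TABLE_LEN 3

static double lookup(double x)
{
    /* Clamp to the table ends instead of extrapolating. */
    if (x <= table_in[0])
        return table_out[0];
    if (x >= table_in[TABLE_LEN - 1])
        return table_out[TABLE_LEN - 1];

    /* Walk the lines until the next line's input is >= x. */
    int i = 0;
    while (table_in[i + 1] < x)
        i++;

    /* Linear interpolation between line i and line i+1. */
    double frac = (x - table_in[i]) / (table_in[i + 1] - table_in[i]);
    return table_out[i] + frac * (table_out[i + 1] - table_out[i]);
}

int main(void)
{
    printf("%g\n", lookup(25.0));   /* prints 750, as worked out above */
    return 0;
}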
-
Of course, you can calculate the inverse function that gives you the temperature from the voltage delivered by the thermocouple.
coefficients = [-3.23439565e-04 2.05876957e-02 -5.93682771e-01 2.55052633e+01 2.42295885e-01]
Program:
import numpy as np
voltages = [
    # ... identical to the ITS-90 type T voltage list (0 °C to 399 °C) in the previous program ...
]
x = np.array(list(range(0, 400)))
y = np.array(voltages[0:400])
coef = np.polyfit(y, x, 4)
print(coef)
p = np.poly1d(coef)
for t in range(1, 400):
    print(f'temp: {t:3d}ºC error:{p(voltages[t])-t:1.2f}ºC')
Result:
[-3.23439565e-04 2.05876957e-02 -5.93682771e-01 2.55052633e+01
2.42295885e-01]
temp: 1ºC error:0.24ºC
temp: 2ºC error:0.23ºC
temp: 3ºC error:0.22ºC
temp: 4ºC error:0.21ºC
temp: 5ºC error:0.19ºC
temp: 6ºC error:0.18ºC
temp: 7ºC error:0.16ºC
temp: 8ºC error:0.14ºC
temp: 9ºC error:0.15ºC
temp: 10ºC error:0.13ºC
temp: 11ºC error:0.13ºC
temp: 12ºC error:0.10ºC
temp: 13ºC error:0.10ºC
temp: 14ºC error:0.07ºC
temp: 15ºC error:0.06ºC
temp: 16ºC error:0.06ºC
temp: 17ºC error:0.05ºC
temp: 18ºC error:0.03ºC
temp: 19ºC error:0.02ºC
temp: 20ºC error:0.03ºC
temp: 21ºC error:0.01ºC
temp: 22ºC error:-0.00ºC
temp: 23ºC error:0.00ºC
temp: 24ºC error:-0.02ºC
temp: 25ºC error:-0.02ºC
temp: 26ºC error:-0.02ºC
temp: 27ºC error:-0.02ºC
temp: 28ºC error:-0.05ºC
temp: 29ºC error:-0.06ºC
temp: 30ºC error:-0.07ºC
temp: 31ºC error:-0.05ºC
temp: 32ºC error:-0.07ºC
temp: 33ºC error:-0.08ºC
temp: 34ºC error:-0.07ºC
temp: 35ºC error:-0.09ºC
temp: 36ºC error:-0.08ºC
temp: 37ºC error:-0.10ºC
temp: 38ºC error:-0.10ºC
temp: 39ºC error:-0.10ºC
temp: 40ºC error:-0.10ºC
temp: 41ºC error:-0.11ºC
temp: 42ºC error:-0.11ºC
temp: 43ºC error:-0.12ºC
temp: 44ºC error:-0.13ºC
temp: 45ºC error:-0.11ºC
temp: 46ºC error:-0.13ºC
temp: 47ºC error:-0.12ºC
temp: 48ºC error:-0.13ºC
temp: 49ºC error:-0.13ºC
temp: 50ºC error:-0.12ºC
temp: 51ºC error:-0.12ºC
temp: 52ºC error:-0.12ºC
temp: 53ºC error:-0.12ºC
temp: 54ºC error:-0.12ºC
temp: 55ºC error:-0.13ºC
temp: 56ºC error:-0.13ºC
temp: 57ºC error:-0.12ºC
temp: 58ºC error:-0.13ºC
temp: 59ºC error:-0.12ºC
temp: 60ºC error:-0.13ºC
temp: 61ºC error:-0.12ºC
temp: 62ºC error:-0.11ºC
temp: 63ºC error:-0.11ºC
temp: 64ºC error:-0.13ºC
temp: 65ºC error:-0.13ºC
temp: 66ºC error:-0.11ºC
temp: 67ºC error:-0.11ºC
temp: 68ºC error:-0.11ºC
temp: 69ºC error:-0.12ºC
temp: 70ºC error:-0.10ºC
temp: 71ºC error:-0.11ºC
temp: 72ºC error:-0.10ºC
temp: 73ºC error:-0.09ºC
temp: 74ºC error:-0.10ºC
temp: 75ºC error:-0.10ºC
temp: 76ºC error:-0.09ºC
temp: 77ºC error:-0.09ºC
temp: 78ºC error:-0.09ºC
temp: 79ºC error:-0.09ºC
temp: 80ºC error:-0.07ºC
temp: 81ºC error:-0.07ºC
temp: 82ºC error:-0.08ºC
temp: 83ºC error:-0.06ºC
temp: 84ºC error:-0.07ºC
temp: 85ºC error:-0.06ºC
temp: 86ºC error:-0.05ºC
temp: 87ºC error:-0.04ºC
temp: 88ºC error:-0.05ºC
temp: 89ºC error:-0.05ºC
temp: 90ºC error:-0.04ºC
temp: 91ºC error:-0.04ºC
temp: 92ºC error:-0.02ºC
temp: 93ºC error:-0.02ºC
temp: 94ºC error:-0.02ºC
temp: 95ºC error:-0.01ºC
temp: 96ºC error:-0.01ºC
temp: 97ºC error:-0.02ºC
temp: 98ºC error:-0.01ºC
temp: 99ºC error:0.00ºC
temp: 100ºC error:0.01ºC
temp: 101ºC error:-0.00ºC
temp: 102ºC error:0.01ºC
temp: 103ºC error:0.01ºC
temp: 104ºC error:0.01ºC
temp: 105ºC error:0.01ºC
temp: 106ºC error:0.04ºC
temp: 107ºC error:0.03ºC
temp: 108ºC error:0.03ºC
temp: 109ºC error:0.02ºC
temp: 110ºC error:0.04ºC
temp: 111ºC error:0.05ºC
temp: 112ºC error:0.04ºC
temp: 113ºC error:0.05ºC
temp: 114ºC error:0.06ºC
temp: 115ºC error:0.05ºC
temp: 116ºC error:0.05ºC
temp: 117ºC error:0.06ºC
temp: 118ºC error:0.06ºC
temp: 119ºC error:0.06ºC
temp: 120ºC error:0.06ºC
temp: 121ºC error:0.08ºC
temp: 122ºC error:0.07ºC
temp: 123ºC error:0.07ºC
temp: 124ºC error:0.08ºC
temp: 125ºC error:0.07ºC
temp: 126ºC error:0.08ºC
temp: 127ºC error:0.07ºC
temp: 128ºC error:0.08ºC
temp: 129ºC error:0.09ºC
temp: 130ºC error:0.09ºC
temp: 131ºC error:0.10ºC
temp: 132ºC error:0.10ºC
temp: 133ºC error:0.10ºC
temp: 134ºC error:0.10ºC
temp: 135ºC error:0.10ºC
temp: 136ºC error:0.09ºC
temp: 137ºC error:0.09ºC
temp: 138ºC error:0.10ºC
temp: 139ºC error:0.09ºC
temp: 140ºC error:0.10ºC
temp: 141ºC error:0.09ºC
temp: 142ºC error:0.10ºC
temp: 143ºC error:0.11ºC
temp: 144ºC error:0.09ºC
temp: 145ºC error:0.10ºC
temp: 146ºC error:0.10ºC
temp: 147ºC error:0.10ºC
temp: 148ºC error:0.10ºC
temp: 149ºC error:0.10ºC
temp: 150ºC error:0.10ºC
temp: 151ºC error:0.09ºC
temp: 152ºC error:0.11ºC
temp: 153ºC error:0.10ºC
temp: 154ºC error:0.09ºC
temp: 155ºC error:0.10ºC
temp: 156ºC error:0.09ºC
temp: 157ºC error:0.10ºC
temp: 158ºC error:0.09ºC
temp: 159ºC error:0.09ºC
temp: 160ºC error:0.10ºC
temp: 161ºC error:0.10ºC
temp: 162ºC error:0.08ºC
temp: 163ºC error:0.08ºC
temp: 164ºC error:0.08ºC
temp: 165ºC error:0.08ºC
temp: 166ºC error:0.09ºC
temp: 167ºC error:0.09ºC
temp: 168ºC error:0.08ºC
temp: 169ºC error:0.07ºC
temp: 170ºC error:0.08ºC
temp: 171ºC error:0.07ºC
temp: 172ºC error:0.08ºC
temp: 173ºC error:0.07ºC
temp: 174ºC error:0.08ºC
temp: 175ºC error:0.06ºC
temp: 176ºC error:0.06ºC
temp: 177ºC error:0.07ºC
temp: 178ºC error:0.07ºC
temp: 179ºC error:0.07ºC
temp: 180ºC error:0.07ºC
temp: 181ºC error:0.06ºC
temp: 182ºC error:0.06ºC
temp: 183ºC error:0.05ºC
temp: 184ºC error:0.05ºC
temp: 185ºC error:0.04ºC
temp: 186ºC error:0.05ºC
temp: 187ºC error:0.04ºC
temp: 188ºC error:0.03ºC
temp: 189ºC error:0.04ºC
temp: 190ºC error:0.03ºC
temp: 191ºC error:0.03ºC
temp: 192ºC error:0.04ºC
temp: 193ºC error:0.02ºC
temp: 194ºC error:0.02ºC
temp: 195ºC error:0.02ºC
temp: 196ºC error:0.02ºC
temp: 197ºC error:0.02ºC
temp: 198ºC error:0.02ºC
temp: 199ºC error:0.01ºC
temp: 200ºC error:0.01ºC
temp: 201ºC error:0.00ºC
temp: 202ºC error:0.01ºC
temp: 203ºC error:0.01ºC
temp: 204ºC error:-0.00ºC
temp: 205ºC error:0.01ºC
temp: 206ºC error:-0.00ºC
temp: 207ºC error:0.00ºC
temp: 208ºC error:-0.01ºC
temp: 209ºC error:-0.01ºC
temp: 210ºC error:-0.02ºC
temp: 211ºC error:-0.02ºC
temp: 212ºC error:-0.02ºC
temp: 213ºC error:-0.02ºC
temp: 214ºC error:-0.02ºC
temp: 215ºC error:-0.02ºC
temp: 216ºC error:-0.02ºC
temp: 217ºC error:-0.02ºC
temp: 218ºC error:-0.03ºC
temp: 219ºC error:-0.03ºC
temp: 220ºC error:-0.04ºC
temp: 221ºC error:-0.03ºC
temp: 222ºC error:-0.04ºC
temp: 223ºC error:-0.05ºC
temp: 224ºC error:-0.04ºC
temp: 225ºC error:-0.05ºC
temp: 226ºC error:-0.04ºC
temp: 227ºC error:-0.05ºC
temp: 228ºC error:-0.05ºC
temp: 229ºC error:-0.05ºC
temp: 230ºC error:-0.06ºC
temp: 231ºC error:-0.06ºC
temp: 232ºC error:-0.06ºC
temp: 233ºC error:-0.06ºC
temp: 234ºC error:-0.06ºC
temp: 235ºC error:-0.06ºC
temp: 236ºC error:-0.06ºC
temp: 237ºC error:-0.07ºC
temp: 238ºC error:-0.07ºC
temp: 239ºC error:-0.06ºC
temp: 240ºC error:-0.07ºC
temp: 241ºC error:-0.07ºC
temp: 242ºC error:-0.06ºC
temp: 243ºC error:-0.07ºC
temp: 244ºC error:-0.06ºC
temp: 245ºC error:-0.07ºC
temp: 246ºC error:-0.07ºC
temp: 247ºC error:-0.08ºC
temp: 248ºC error:-0.07ºC
temp: 249ºC error:-0.07ºC
temp: 250ºC error:-0.08ºC
temp: 251ºC error:-0.08ºC
temp: 252ºC error:-0.08ºC
temp: 253ºC error:-0.08ºC
temp: 254ºC error:-0.08ºC
temp: 255ºC error:-0.08ºC
temp: 256ºC error:-0.08ºC
temp: 257ºC error:-0.08ºC
temp: 258ºC error:-0.08ºC
temp: 259ºC error:-0.07ºC
temp: 260ºC error:-0.08ºC
temp: 261ºC error:-0.08ºC
temp: 262ºC error:-0.07ºC
temp: 263ºC error:-0.08ºC
temp: 264ºC error:-0.08ºC
temp: 265ºC error:-0.07ºC
temp: 266ºC error:-0.08ºC
temp: 267ºC error:-0.08ºC
temp: 268ºC error:-0.07ºC
temp: 269ºC error:-0.08ºC
temp: 270ºC error:-0.08ºC
temp: 271ºC error:-0.07ºC
temp: 272ºC error:-0.07ºC
temp: 273ºC error:-0.06ºC
temp: 274ºC error:-0.08ºC
temp: 275ºC error:-0.08ºC
temp: 276ºC error:-0.08ºC
temp: 277ºC error:-0.08ºC
temp: 278ºC error:-0.06ºC
temp: 279ºC error:-0.06ºC
temp: 280ºC error:-0.06ºC
temp: 281ºC error:-0.06ºC
temp: 282ºC error:-0.07ºC
temp: 283ºC error:-0.06ºC
temp: 284ºC error:-0.06ºC
temp: 285ºC error:-0.07ºC
temp: 286ºC error:-0.05ºC
temp: 287ºC error:-0.06ºC
temp: 288ºC error:-0.05ºC
temp: 289ºC error:-0.04ºC
temp: 290ºC error:-0.05ºC
temp: 291ºC error:-0.04ºC
temp: 292ºC error:-0.04ºC
temp: 293ºC error:-0.05ºC
temp: 294ºC error:-0.04ºC
temp: 295ºC error:-0.04ºC
temp: 296ºC error:-0.04ºC
temp: 297ºC error:-0.03ºC
temp: 298ºC error:-0.03ºC
temp: 299ºC error:-0.03ºC
temp: 300ºC error:-0.03ºC
temp: 301ºC error:-0.03ºC
temp: 302ºC error:-0.03ºC
temp: 303ºC error:-0.03ºC
temp: 304ºC error:-0.01ºC
temp: 305ºC error:-0.02ºC
temp: 306ºC error:-0.02ºC
temp: 307ºC error:-0.00ºC
temp: 308ºC error:-0.01ºC
temp: 309ºC error:-0.01ºC
temp: 310ºC error:-0.00ºC
temp: 311ºC error:-0.01ºC
temp: 312ºC error:-0.00ºC
temp: 313ºC error:0.01ºC
temp: 314ºC error:0.00ºC
temp: 315ºC error:0.01ºC
temp: 316ºC error:0.01ºC
temp: 317ºC error:0.02ºC
temp: 318ºC error:0.01ºC
temp: 319ºC error:0.01ºC
temp: 320ºC error:0.02ºC
temp: 321ºC error:0.02ºC
temp: 322ºC error:0.02ºC
temp: 323ºC error:0.03ºC
temp: 324ºC error:0.03ºC
temp: 325ºC error:0.03ºC
temp: 326ºC error:0.05ºC
temp: 327ºC error:0.04ºC
temp: 328ºC error:0.04ºC
temp: 329ºC error:0.04ºC
temp: 330ºC error:0.05ºC
temp: 331ºC error:0.05ºC
temp: 332ºC error:0.05ºC
temp: 333ºC error:0.06ºC
temp: 334ºC error:0.05ºC
temp: 335ºC error:0.06ºC
temp: 336ºC error:0.05ºC
temp: 337ºC error:0.06ºC
temp: 338ºC error:0.07ºC
temp: 339ºC error:0.06ºC
temp: 340ºC error:0.07ºC
temp: 341ºC error:0.07ºC
temp: 342ºC error:0.08ºC
temp: 343ºC error:0.08ºC
temp: 344ºC error:0.07ºC
temp: 345ºC error:0.07ºC
temp: 346ºC error:0.07ºC
temp: 347ºC error:0.07ºC
temp: 348ºC error:0.08ºC
temp: 349ºC error:0.09ºC
temp: 350ºC error:0.09ºC
temp: 351ºC error:0.09ºC
temp: 352ºC error:0.08ºC
temp: 353ºC error:0.08ºC
temp: 354ºC error:0.09ºC
temp: 355ºC error:0.09ºC
temp: 356ºC error:0.08ºC
temp: 357ºC error:0.09ºC
temp: 358ºC error:0.08ºC
temp: 359ºC error:0.09ºC
temp: 360ºC error:0.08ºC
temp: 361ºC error:0.09ºC
temp: 362ºC error:0.08ºC
temp: 363ºC error:0.08ºC
temp: 364ºC error:0.09ºC
temp: 365ºC error:0.07ºC
temp: 366ºC error:0.07ºC
temp: 367ºC error:0.08ºC
temp: 368ºC error:0.08ºC
temp: 369ºC error:0.08ºC
temp: 370ºC error:0.07ºC
temp: 371ºC error:0.07ºC
temp: 372ºC error:0.07ºC
temp: 373ºC error:0.06ºC
temp: 374ºC error:0.06ºC
temp: 375ºC error:0.05ºC
temp: 376ºC error:0.05ºC
temp: 377ºC error:0.04ºC
temp: 378ºC error:0.03ºC
temp: 379ºC error:0.02ºC
temp: 380ºC error:0.02ºC
temp: 381ºC error:0.01ºC
temp: 382ºC error:-0.00ºC
temp: 383ºC error:0.00ºC
temp: 384ºC error:-0.01ºC
temp: 385ºC error:-0.03ºC
temp: 386ºC error:-0.03ºC
temp: 387ºC error:-0.05ºC
temp: 388ºC error:-0.05ºC
temp: 389ºC error:-0.07ºC
temp: 390ºC error:-0.07ºC
temp: 391ºC error:-0.08ºC
temp: 392ºC error:-0.10ºC
temp: 393ºC error:-0.11ºC
temp: 394ºC error:-0.12ºC
temp: 395ºC error:-0.15ºC
temp: 396ºC error:-0.16ºC
temp: 397ºC error:-0.17ºC
temp: 398ºC error:-0.20ºC
temp: 399ºC error:-0.21ºC
-
There is a problem with this approach. The curve may be approximately logarithmic, but not an exact logarithm. In that case (which is most of the time) it is again necessary to use a polynomial.
Note that logarithms are also calculated with polynomials, usually by doing the transformation:
t=(x-1)/(x+1)
And then applying a polynomial:
log_10(x) = a1·t + a3·t^3 + a5·t^5 + a7·t^7 + a9·t^9
a1=0.868591718
a3=0.289335524
a5=0.177522071
a7=0.094376476
a9=0.191337714
for x in the interval [1/sqrt(10), sqrt(10)]
(From Abramowitz and Stegun)
Hi,
We do not use logs alone; we use logs combined with a polynomial fit. That results in very low error.
Of course, there is also the formula for thermistors, which may be even simpler if the thermistor is a good one.
-
Many people have already said helpful things about the different approaches: fitting polynomials, piecewise linear interpolation, splines etc. So all I would like to add is this: if you happen to know why and how the sensor response is non-linear, the best method is to fit the specific function that is rooted in the physical model instead of fitting a generic smoother. E.g. if you know the response is logarithmic, fit a log function, not a polynomial. But if you don't, go with splines/LOESS/polynomials/whatever.
Hi,
I had mentioned this as one trick in post #12. I did that several years ago and it results in a really good fit, within about 0.1 percent over the entire temperature range. That means that if the calculated resistance was 100 Ohms, the real value would be between 99.9 Ohms and 100.1 Ohms.
It requires using logarithms combined with a polynomial fit.
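As a sketch of what "logarithm plus polynomial" can look like for a thermistor, here is the well-known Steinhart-Hart form, which is a cubic polynomial in ln(R); the coefficient values below are placeholders only, not a real calibration, and real values come from fitting your own measured (R, T) points:

#include <math.h>

/* Steinhart-Hart style fit: 1/T = A + B*ln(R) + C*ln(R)^3,
 * with T in kelvin and R in ohms. Placeholder coefficients only. */
static const double SH_A = 1.0e-3;   /* placeholder */
static const double SH_B = 2.0e-4;   /* placeholder */
static const double SH_C = 1.0e-7;   /* placeholder */

static double thermistor_celsius(double r_ohms)
{
    const double lr    = log(r_ohms);
    const double inv_T = SH_A + SH_B * lr + SH_C * lr * lr * lr;
    return 1.0 / inv_T - 273.15;
}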
-
If we limit ourselves to calibration, then the issue becomes one of interpolation between calibrated values.
That is, we don't have, nor do we need, any specific algebraic description of the behaviour of the system; we only have the calibration points we can trust.
The two common approaches are linear and cubic interpolation. Here is an example implementation of a linear interpolation, using an array of CALIBRATION_POINTS pairs of input, value points:
// SPDX-License-Identifier: CC0-1.0
#include <stdlib.h> // Only for the test program part
#include <stdio.h>  // Only for the test program part
#include <math.h>   // Only for the test program part
#include <stdint.h>

// For the test program part
#ifndef CALIBRATION_POINTS
#define CALIBRATION_POINTS 512
#endif

// Sorted in input increasing order: calibrated_input[i] < calibrated_input[i+1]
uint16_t calibrated_input[CALIBRATION_POINTS];
int32_t  calibrated_value[CALIBRATION_POINTS];
const uint32_t calibrated_limit = CALIBRATION_POINTS - 1;

int32_t calibrated(uint_fast16_t input)
{
    // Range of values is limited to the calibration range.
    // That is, this does *interpolation* only, not *extrapolation*.
    if (calibrated_input[0] >= input)
        return calibrated_value[0];
    if (calibrated_input[calibrated_limit] <= input)
        return calibrated_value[calibrated_limit];

    uint_fast16_t imin = 0;                 // Minimum index, inclusive
    uint_fast16_t imax = calibrated_limit;  // Maximum index, exclusive
    while (1) {
        uint_fast16_t i = imin + (imax - imin) / 2;
        if (i == imin)
            break;
        if (calibrated_input[i] < input) {
            imin = i;
        } else
        if (calibrated_input[i] > input) {
            imax = i;
        } else {
            return calibrated_value[i];
        }
    }

    if (calibrated_input[imin] > input || calibrated_input[imin+1] < input)
        printf("Error: %d < %d < %d is not true!\n", (int)calibrated_input[imin], (int)input, (int)calibrated_input[imin+1]);

    // At this point, we have calibrated_input[imin] < input
    // and calibrated_input[imin+1] > input.
    const uint_fast16_t dmin = calibrated_input[imin+1] - input;
    const uint_fast16_t dmax = input - calibrated_input[imin];
    const uint_fast16_t dsum = calibrated_input[imin+1] - calibrated_input[imin];

    // Note that by definition, dmin, dmax, dsum are 16-bit values,
    // so the following really only uses 48-bit integers.
    return (int32_t)( ( dmin * (int64_t)calibrated_value[imin]
                      + dmax * (int64_t)calibrated_value[imin+1] ) / dsum );
}

/*
 * Test program follows
 */

// Given x = 0..1, what should the sample value x be?
double w(double x)
{
    return pow(x, 1.5);
}

// Function to use for testing
double f(double x)
{
    return sqrt(x);
}

int main(void)
{
    // For testing, we use fixed point integer values with 9 integer bits and 22 fractional bits,
    // with f(x) = sqrt(x), x = 0 .. 65535
    printf("# input[i] value[i] f(input[i])\n");
    for (uint_fast16_t i = 0; i < CALIBRATION_POINTS; i++) {
        // Importance sampling: we emphasize low values
        const uint_fast16_t x = round(65535.0 * w((double)i / (CALIBRATION_POINTS - 1)));
        const double r = f((double)x);
        const int32_t v = round(r * 4194304.0);
        calibrated_input[i] = x;
        calibrated_value[i] = v;
        printf("%-5u %-10d %16.12f\n", (unsigned int)x, (int)v, r);
    }
    printf("\n");
    printf("# x calibrated(x) f(x) relative_error\n");
    double errmin = 0.0, errmax = 0.0;
    for (uint_fast16_t x = 65535; x > 0; x--) {
        uint32_t v = calibrated(x);
        const double d = (double)v / 4194304.0;
        const double r = sqrt((double)x);
        const double err = (d - r) / r;
        if (errmin > err)
            errmin = err;
        if (errmax < err)
            errmax = err;
        if (errmin == err || errmax == err)
            printf("%5d %16.12f %16.12f %+.12f\n", (int)x, d, r, err);
    }
    return 0;
}

It includes a test program that creates a synthetic calibration table for fixed-point square roots.
The calibrated() function assumes the input values are in increasing order, and uses a binary search to look up the correct index.
Each iteration halves the index range sought after, so with 2048 values, it only does 11 iterations. Even with 65535 values, it'd only do 15 iterations. As you can see, it does have one index comparison (to find out when we have the index), and a <, =, > between the input at that index and the input we are trying to resolve to a value, so it is very, very fast and efficient.
Because I wish to emphasize that regularly spaced inputs are not needed (in many cases the binary search is faster than the division used to map the input value range to regularly spaced points, especially if the microcontroller does not have a fast hardware division operation), I used an importance-sampled square root as the example function here. The arrays contain x and sqrt(x), the latter in fixed-point integer format with 22 fractional bits. The input is 16-bit and the value 32-bit, so it takes 6 bytes per calibration point. By default, the example program uses 512 calibration points (which would correspond to 3k of Flash), but you can easily adjust it; just define CALIBRATION_POINTS when compiling (-DCALIBRATION_POINTS=300 for example).
The output is in two parts: the first part is the calibration table. Note how the calibration inputs are not regularly spaced!
The second part compares the looked-up result to calculated result, and whenever the relative error increases prints the output, so you can see the relative errors this gives.
As the sqrt() function's slope starts steep and then becomes smaller, the linearly interpolated value will always be below the actual value. (The occasional positive relative error is from rounding the floating-point value to an integer.) If you used regularly spaced input values, the relative error at small inputs would become unacceptable for this kind of function. As it is, the error is less than 0.04% for inputs above 256, but at 23 and smaller it exceeds 1%.
Thus, the most important thing when using linear interpolation is to choose the calibration points so that connecting them with lines produces a good representation of the continuous values.
-
Linear interpolation is usually very poor and only gives precise results when using a very large number of points.
Cubic interpolation is more difficult to handle, but in contrast returns surprisingly good accuracy compared to linear. This can be used to significantly reduce the number of points in the table.
Reference for programming algorithms:
https://en.wikipedia.org/wiki/Neville%27s_algorithm (https://en.wikipedia.org/wiki/Neville%27s_algorithm)
https://www.grad.hr/nastava/gs/prg/NumericalRecipesinC.pdf (https://www.grad.hr/nastava/gs/prg/NumericalRecipesinC.pdf)
https://apps.dtic.mil/sti/pdfs/ADA612943.pdf (https://apps.dtic.mil/sti/pdfs/ADA612943.pdf)
https://www.sci.brooklyn.cuny.edu/~mate/nml/numanal.pdf (https://www.sci.brooklyn.cuny.edu/~mate/nml/numanal.pdf)
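For reference, here is a minimal sketch of Neville's algorithm in C (an illustration based on the Wikipedia page above, not taken from any of these references): it evaluates the interpolating polynomial through n points at a single point xt, in O(n²) operations, without computing the coefficients explicitly.
#include <stddef.h>
// Neville's algorithm: evaluate the unique polynomial passing through
// (x[0], y[0]) ... (x[n-1], y[n-1]) at the point xt.  The x[] values must be
// distinct.  Uses O(n) scratch space and O(n^2) arithmetic operations.
double neville(const double *x, const double *y, size_t n, double xt)
{
    double p[n];  // C99 variable-length array holding the current row
    for (size_t i = 0; i < n; i++)
        p[i] = y[i];
    for (size_t k = 1; k < n; k++)
        for (size_t i = 0; i + k < n; i++)
            p[i] = ((xt - x[i + k]) * p[i] + (x[i] - xt) * p[i + 1])
                 / (x[i] - x[i + k]);
    return p[0];
}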
-
Some time ago I implemented a 100000-point calibration for an NWT-7 device :D. It works very well to compensate a very nonlinear curve, and allows pretty precise measurements on a cheap device. :)
-
Of course, LUTs without interpolation are the simplest approach, and have the great advantage of needing no calculations and being very fast.
It is like random number generation: you can store a list of 100000 random numbers (with the advantage of them being really random), or perform a more complicated calculation to generate pseudo-random numbers (with the advantage of taking less space).
----------------
Another way to interpolate a table is to store the coefficients of several interpolating polynomials, one per table section. You can place several relatively low-degree (cubic) interpolating polynomials with precomputed coefficients, to save you the somewhat computationally intensive (\$O(n^2)\$) evaluation of Neville's algorithm.
-
Here is the cubic interpolation corresponding to the linear interpolation I showed in reply #28. It's not difficult at all, either.
// SPDX-License-Identifier: CC0-1.0
#include <stdlib.h> // Only for the test program part
#include <stdio.h> // Only for the test program part
#include <math.h> // Only for the test program part
#include <stdint.h>
// For the test program part
#ifndef CALIBRATION_POINTS
#define CALIBRATION_POINTS 128
#endif
// Sorted in input increasing order: calibrated_input[i] < calibrated_input[i+1]
float calibrated_input[CALIBRATION_POINTS];
float calibrated_coeff[CALIBRATION_POINTS - 1][4];
const uint32_t calibrated_limit = CALIBRATION_POINTS - 1;
float calibrated(const float input)
{
uint32_t i;
do {
if (calibrated_input[0] >= input) {
i = 0;
break;
}
if (calibrated_input[calibrated_limit] <= input) {
i = calibrated_limit - 1;
break;
}
// Binary search.
uint32_t imin = 0; // Inclusive
uint32_t imax = calibrated_limit; // Exclusive
while (1) {
i = imin + (imax - imin) / 2;
if (i == imin)
break;
if (calibrated_input[i] > input) {
imax = i;
} else
if (calibrated_input[i] < input) {
imin = i;
} else
break;
}
} while (0);
// We have calibrated_input[i] <= input < calibrated_input[i+1].
const float d0 = input - calibrated_input[i];
const float d1 = calibrated_input[i+1] - input;
const float d0d0 = d0 * d0;
const float d1d1 = d1 * d1;
return d0d0*(d0*calibrated_coeff[i][0] + d1*calibrated_coeff[i][1])
+ d1d1*(d0*calibrated_coeff[i][2] + d1*calibrated_coeff[i][3]);
}
/*
* Test program follows
*/
// Value and derivative used to calculate the coefficients
float calibrated_value[CALIBRATION_POINTS];
float calibrated_deriv[CALIBRATION_POINTS]; // Actual slope at the calibration point
float calibrated_slope[CALIBRATION_POINTS]; // Estimated slope at the calibration point
// Importance sampling: Given x = 0..1, what should the sample value x be?
double w(double x)
{
return x*x;
}
// Function to use for testing
double f(double x)
{
return sqrt(x);
}
// Derivative of the above function for testing
double df(double x)
{
return 0.5 / sqrt(x);
}
int main(void)
{
/*
* Mapping calibration values to coefficients
*/
// First, we obtain a set of calibration points, input sorted in increasing order.
for (uint32_t i = 0; i < CALIBRATION_POINTS; i++) {
// Importance sampling: we have denser calibration points for smaller inputs.
const double p = w((double)i / (double)(CALIBRATION_POINTS - 1));
const double x = (1.0 - p)*1.0 + p*65535.0;
calibrated_input[i] = x;
calibrated_value[i] = f(x);
calibrated_deriv[i] = df(x);
}
// If we do not know the derivative at each calibration point, we can calculate them
// based on preceding and succeeding point for each calibration point.
// Initial slope is toward the second calibration point.
calibrated_slope[0] = (calibrated_value[1] - calibrated_value[0]) / (calibrated_input[1] - calibrated_input[0]);
// Final slope is also similarly linear. Thus, extrapolation will be purely linear.
calibrated_slope[calibrated_limit] = (calibrated_value[calibrated_limit] - calibrated_value[calibrated_limit - 1])
/ (calibrated_input[calibrated_limit] - calibrated_input[calibrated_limit - 1]);
// For the others, the slope is calculated from the preceding and succeeding slopes between the calibration points,
// weighed by the input distances.
for (uint32_t i = 1; i < CALIBRATION_POINTS - 1; i++) {
const double dx0 = calibrated_input[i ] - calibrated_input[i - 1];
const double dx1 = calibrated_input[i + 1] - calibrated_input[i ];
const double dx = calibrated_input[i + 1] - calibrated_input[i - 1];
const double dy0 = calibrated_value[i ] - calibrated_value[i - 1];
const double dy1 = calibrated_value[i + 1] - calibrated_value[i ];
// dy0/dx0 is the slope between previous and current point, dy1/dx1 the slope between current and next point.
// Weight is 1 - dx0/dx = dx1/dx for the previous slope, and 1 - dx1/dx = dx0/dx for the next slope.
calibrated_slope[i] = (dx1/dx)*(dy0/dx0) + (dx0/dx)*(dy1/dx1);
// Advanced models can adjust slopes to control the behaviour between calibration points,
// when additional information is present.
// It is most commonly done only for segments containing a peak in value, to control the overshoot.
}
// When calibration sample points and the slopes at each point are known, we can calculate the four coefficients.
printf("# x f(x) df(x)/dx slope C0 C1 C2 C3\n");
for (uint32_t i = 0; i < CALIBRATION_POINTS - 1; i++) {
const double dx = calibrated_input[i + 1] - calibrated_input[i];
const double dx2 = dx*dx;
const double dx3 = dx*dx2;
const double c0 = calibrated_value[i + 1] / dx3;
const double c3 = calibrated_value[i] / dx3;
#ifdef USE_SLOPE
const double c1 = 3*c0 - calibrated_slope[i+1] / dx2;
const double c2 = 3*c3 + calibrated_slope[i] / dx2;
#else /* USE_DERIV */
const double c1 = 3*c0 - calibrated_deriv[i+1] / dx2;
const double c2 = 3*c3 + calibrated_deriv[i] / dx2;
#endif
calibrated_coeff[i][0] = c0;
calibrated_coeff[i][1] = c1;
calibrated_coeff[i][2] = c2;
calibrated_coeff[i][3] = c3;
printf("%18.12f %15.12f %15.12f %15.12f %15.12f %15.12f %15.12f %15.12f\n",
calibrated_input[i], calibrated_value[i],
calibrated_deriv[i], calibrated_slope[i],
calibrated_coeff[i][0], calibrated_coeff[i][1],
calibrated_coeff[i][2], calibrated_coeff[i][3]);
}
printf("\n");
printf("# x calibrated(x) f(x) absolute_error relative_error\n");
for (float x = 1.0f; x < 65536.0f; x += 1.0f) {
const float c = calibrated(x);
const double r = f(x);
printf("%18.12f %18.12f %18.12f %+.12f %+.12f\n", x, c, r, c - r, (c - r)/r);
}
return 0;
}
The calibrated_input[] array consists of the CALIBRATION_POINTS input values, and the calibrated_coeff[] array holds the four coefficients for each of the CALIBRATION_POINTS-1 segments (each segment starting at the corresponding input value). Extrapolation uses the initial and final segments.
The calibrated() function again uses a binary search to find the segment that contains the input value. It then uses the numerically stable version of the cubic interpolation using the four coefficients (and the input values defining the segment). This time, I used a slightly different structure for the function, with an outer do { ... } while (0) loop only for early break-out (when the segment index i has been discovered), but as you can see, it is very similar to the one for linear interpolation.
The test program uses the known derivative of \$\sqrt{x}\$, \$\frac{d \sqrt{x}}{d x} = \frac{1}{2 \sqrt{x}}\$, but if you define USE_SLOPE at compile time (e.g. the -DUSE_SLOPE command-line parameter to gcc, clang, or icc), it estimates the slope at each calibration point from the previous and next calibration points. The four coefficients for each curve segment are as listed in reply #9. Because each sample point takes 4+16 bytes (compared to 2+4 bytes in the linear interpolation example), this time the default number of calibration points is just 128 (for 2560 bytes of Flash). Also, both the input and the resulting value are now floats. I could have implemented this in fixed point, but it would have looked much messier: multiplying a*b, where they have fa and fb fractional bits and the result is wanted with fc < fa+fb fractional bits, is (a*b)>>(fa+fb-fc), the intermediate type has to be large enough to hold the entire product, and correct rounding can look odd (adding 1<<(fa+fb-fc-1) if the result is positive, subtracting it if negative, before the shift is applied). I wanted to focus on the important bits. The cubic polynomial part itself isn't onerous either way, because it is just 8 "high-precision" multiplications and five additions or subtractions.
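For what it is worth, a minimal sketch of that fixed-point multiply with rounding (assuming fa + fb > fc and that the full product fits in 64 bits; purely illustrative):
#include <stdint.h>
// Multiply a (fa fractional bits) by b (fb fractional bits), returning the
// product with fc fractional bits, rounded half away from zero.
// Assumes fa + fb > fc, so the shift is at least 1.
static inline int32_t fixmul(int32_t a, int32_t b,
                             unsigned fa, unsigned fb, unsigned fc)
{
    const unsigned shift = fa + fb - fc;
    const int64_t  half  = (int64_t)1 << (shift - 1);
    const int64_t  p     = (int64_t)a * (int64_t)b;  // fa + fb fractional bits
    return (p >= 0) ? (int32_t)(  (p + half) >> shift)
                    : (int32_t)(-((-p + half) >> shift));
}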
Even though I used importance sampling here as well, relative error in the interpolated value is still rather big for small inputs. Even heavier emphasis (shorter segments at small inputs) would be useful, because the relative error becomes ridiculously small at higher values. (After all, float only has about seven to eight significant digits; a relative error of ±0.00000003 or less means the value is within half an ULP in a float, i.e. as correct as you can get with float. The actual values are computed at double precision –– I normally prefer to go as high as possible, long double or _Float128 (via libquadmath in C), but kept to the familiar double here for clarity.)
Note: I am not saying that piecewise cubic or linear approximation is the sensible way to deal with the square root operation! It is much better solved using even naïve Newton-Raphson, for example, where you iterate x ← x/2 + C/x to solve x = sqrt(2*C) starting with x = C (or with the famous Quake III fast inverse square root trick), although many other approaches are available (https://en.wikipedia.org/wiki/Methods_of_computing_square_roots), including binary ones. I'm trying to show the look-up instead, and how to define the cubic coefficients based either on the preceding and succeeding calibration points for each point, or on the derivative of the value at each calibration point.
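For illustration, a minimal sketch of that Newton-Raphson square root (the fixed point of x ← x/2 + C/x is x = sqrt(2*C), so C = S/2 gives sqrt(S); the frexp()-based starting guess is my own addition to keep the iteration count small):
#include <math.h>
// Newton-Raphson square root of S > 0: iterate x <- x/2 + C/x with C = S/2.
double nr_sqrt(double S)
{
    if (S <= 0.0)
        return 0.0;                // ignore the S < 0 error case in this sketch
    const double C = 0.5 * S;
    int e;
    frexp(S, &e);                  // S = m * 2^e with 0.5 <= m < 1
    double x = ldexp(1.0, e / 2);  // rough initial guess near sqrt(S)
    for (int i = 0; i < 8; i++)    // quadratic convergence: a few steps suffice
        x = 0.5 * x + C / x;
    return x;
}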
-
Linear interpolation is usually very poor and only gives precise results when using a very large number of points.
Cubic interpolation is more difficult to handle, but in contrast returns surprisingly good accuracy compared to linear. This can be used to significantly reduce the number of points in the table.
Reference for programming algorithms:
https://en.wikipedia.org/wiki/Neville%27s_algorithm (https://en.wikipedia.org/wiki/Neville%27s_algorithm)
https://www.grad.hr/nastava/gs/prg/NumericalRecipesinC.pdf (https://www.grad.hr/nastava/gs/prg/NumericalRecipesinC.pdf)
https://apps.dtic.mil/sti/pdfs/ADA612943.pdf (https://apps.dtic.mil/sti/pdfs/ADA612943.pdf)
https://www.sci.brooklyn.cuny.edu/~mate/nml/numanal.pdf (https://www.sci.brooklyn.cuny.edu/~mate/nml/numanal.pdf)
Hi there,
Not sure where you got the idea that linear interpolation usually gives poor results. It depends highly on the curve type; with some curves it is going to give very good results.
You can measure the efficiency of the technique for any curve by comparing the interpolated results from a given finite table against a table with many more stored samples. In one case, a table of 100 values with linear interpolation matched the results of a table of 45000 entries with no interpolation. That's a 450-to-1 saving in memory. Without question a cubic interpolation could work even better, perhaps pushing that to 4500 to 1, but then the run-time calculations take longer.
-
Bivariate interpolation, where you have a function \$z = f(x,y)\$ and some calibrated points \$(x, y, z)\$, can be divided into two major classes: one has a regular grid in \$x, y\$, and the other has randomly placed points.
The regular rectangular grid is the easiest, and both bilinear and bicubic interpolation are used. For bilinear, you find the cell containing the point; the four corners then have coordinates \$(x_0, y_0, z_{00})\$, \$(x_1, y_0, z_{10})\$, \$(x_0, y_1, z_{01})\$, and \$(x_1, y_1, z_{11})\$. The relative coordinates within the cell are calculated first, then the bilinearly interpolated \$z\$:
$$\begin{aligned}
u &= \frac{x - x_0}{x_1 - x_0} \\
v &= \frac{y - y_0}{y_1 - y_0} \\
z &= (1-v)\bigl( (1-u) z_{00} + u z_{10} \bigr) + v \bigl( (1-u) z_{01} + u z_{11} \bigr) \\
~ &= z_{00} + u (z_{10} - z_{00}) + v (z_{01} - z_{00}) + u v ( z_{11} - z_{10} - z_{01} + z_{00} ) \\
\end{aligned}$$
although the first version for the calculation of \$z\$ is numerically stable, whereas the second is not. (I included it to show that it indeed is bilinear: there is a constant coefficient, a linear coefficient for each variable, and a cross coefficient for the product of the two input variables. That kind of equation can always be turned into a more numerically stable form, similar to the upper one of the two.)
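A minimal sketch of the numerically stable form for a single cell, assuming the cell lookup along each axis has already been done (for example with a binary search per axis); the function and parameter names are placeholders of mine:
// Bilinear interpolation inside the cell (x0,y0)-(x1,y1), where zij is the
// calibrated value at (xi, yj).  Assumes x0 < x1 and y0 < y1.
double bilinear(double x, double y,
                double x0, double x1, double y0, double y1,
                double z00, double z10, double z01, double z11)
{
    const double u = (x - x0) / (x1 - x0);
    const double v = (y - y0) / (y1 - y0);
    return (1.0 - v) * ((1.0 - u) * z00 + u * z10)
         +        v  * ((1.0 - u) * z01 + u * z11);
}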
For regular rectangular grid and bicubic interpolation, there are two variants. A bicubic patch uses 4×4 points, or eight additional cells surrounding the current one; or you can calculate partial derivatives along each axis for each point, similar to slopes in reply #32 for univariate cubic interpolation.
When each cell is triangular, barycentric coordinates are often used. Each (2D) point inside a triangle is identified by three coordinates, \$u\$, \$v\$, \$w\$, where \$u + v + w = 1\$. This means that really only two of them need to be specified, since the third can always be calculated (say, \$w = 1 - u - v\$).
The vertices of the triangle are always at coordinates \$(1, 0, 0)\$, \$(0, 1, 0)\$, and \$(0, 0, 1)\$, regardless of the shape of the triangle. The center (average of the vertex coordinates) is always \$(1/3, 1/3, 1/3)\$. If the \$z\$ at the respective vertices gets values \$z_u\$, \$z_v\$, and \$z_w\$, then the interpolated value is
$$\begin{aligned}
z &= u z_u + v z_v + w z_w = u z_u + v z_v + (1 - u - v) z_w \\
~ &= z_w + u (z_u - z_w) + v (z_v - z_w) \\
\end{aligned}$$
where yet again the first version is numerically stable. The same interpolation formula applies to \$x\$ and \$y\$ too. Solving that relation for \$u\$, \$v\$, and \$w\$ in terms of \$x\$ and \$y\$, we find
$$\begin{aligned}
D &= x_w (y_u - y_v) + y_w (x_v - x_u) + x_u y_v - x_v y_u \\
u &= \frac{ x (y_v - y_w) - y (x_v - x_w) + x_v y_w - x_w y_v }{D} \\
v &= \frac{ y (x_u - x_w) - x (y_u - y_w) + x_w y_u - x_u y_w }{D} \\
w &= \frac{ x (y_u - y_v) + y (x_v - x_u) + x_u y_v - x_v y_u }{D} \\
\end{aligned}$$
where \$D\$ is twice the triangle area. Again, only two of \$u\$, \$v\$, \$w\$ need to be solved, because the third is trivial to calculate. If any of them is less than \$0\$ or greater than \$1\$, point \$(x, y)\$ is outside the triangle.
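Putting the formulas together, here is a minimal sketch of interpolation inside one triangle (the vertex and function names are mine, and a degenerate triangle with \$D = 0\$ is not handled):
#include <stdbool.h>
// Interpolate z at (x, y) inside the triangle with vertices (xu,yu), (xv,yv),
// (xw,yw) carrying values zu, zv, zw.  Returns false if (x, y) is outside.
bool triangle_interpolate(double x, double y,
                          double xu, double yu, double zu,
                          double xv, double yv, double zv,
                          double xw, double yw, double zw,
                          double *z)
{
    const double D = xw*(yu - yv) + yw*(xv - xu) + xu*yv - xv*yu; // twice the signed area
    const double u = (x*(yv - yw) - y*(xv - xw) + xv*yw - xw*yv) / D;
    const double v = (y*(xu - xw) - x*(yu - yw) + xw*yu - xu*yw) / D;
    const double w = 1.0 - u - v;
    if (u < 0.0 || v < 0.0 || w < 0.0 || u > 1.0 || v > 1.0 || w > 1.0)
        return false;             // (x, y) is outside this triangle
    *z = u*zu + v*zv + w*zw;      // numerically stable form
    return true;
}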
For interpolation among random points in \$x, y\$, the points are first converted to a triangular mesh. The valid interpolation area is formed by the convex hull of the points (as if you stretched a rubber band around the points). For each triangular cell, any of the six choices for labeling the vertices is valid.
In the one-dimensional case, we could use binary search to find the correct segment. For triangular meshes in general, most time will be spent locating the correct triangle which contains point \$(x, y)\$. Because of this, it is easier to "walk" along the mesh, and generate e.g. isocurves: a polygon describing constant values of \$z\$ in the \$x, y\$ plane. Looking up individual points is always dominated by the time it takes to find the correct triangle. For walking along the mesh, one typically just records the three triangles sharing sides with the current one, forming a bidirectional connectivity graph between the triangles. Only when the "walk" path exactly crosses a vertex is a search necessary (among all triangles sharing that vertex), but as the triangles around a vertex will always form a connected ring in the connectivity graph, finding the correct triangle is a linear search among the triangles sharing the common vertex. (The search can be made faster by using the 2D equivalent of the vector cross product to compare the walk direction and the direction of the other two vertices for each triangle in \$x, y\$ coordinates, as the product sign depends on their relative orientation, clockwise or counterclockwise. For the correct triangle, the walk direction will be between the two.)
There are fast methods of converting a random set of 2D points into a mesh, the most commonly used one being Delaunay triangulation (https://en.wikipedia.org/wiki/Delaunay_triangulation).
The underlying model can easily be extended to a higher number of dimensions, to trilinear and tricubic interpolation. Then, the triangle is replaced by its generalization, the simplex, which in three dimensions is a tetrahedron, adding one new coordinate axis for each new dimension. Because of the relative slowness of looking up the correct simplex cell, regular rectangular grids are most commonly used in higher dimensions.
Because a regular grid in \$D\$ dimensions with \$K\$ samples along each Cartesian axis needs a total of \$K^D\$ calibration points, and this very rapidly grows to a stupid number of points, polynomial and other fitted functions tend to be used instead. There, least-squares fitting of a smaller number of full-dimensional calibration points is very useful. You can use tools like SciPy or Matlab/Octave for multivariate fitting.
One scheme that I have seen is to start with the random calibration points, then sample the "slow" triangle mesh to generate a regular rectangular grid. There is typically enough information in the triangular mesh to provide good estimates for bicubic interpolation (surface tangents, or slopes along each grid axis), yielding a smooth and continuous interpolation with acceptable memory use, if the underlying surface being modeled is smooth and continuous itself.
Discontinuities in the interpolated function are the hardest to deal with. Consider e.g. \$f(x) = \frac{1}{x - 2}\$, which has a discontinuity at \$x = 2\$. If you approach it from below, it tends to negative infinity; if you approach it from above, it tends to positive infinity. For one-dimensional functions you can discover the discontinuities and define the function piecewise, but for 2D and higher dimensions, it becomes very complicated, and generally not worth even trying.
-
Hi there,
Not sure where you got the idea that linear interpolation usually gives poor results. It depends highly on the curve type; with some curves it is going to give very good results.
You can measure the efficiency of the technique for any curve by comparing the interpolated results from a given finite table against a table with many more stored samples. In one case, a table of 100 values with linear interpolation matched the results of a table of 45000 entries with no interpolation. That's a 450-to-1 saving in memory. Without question a cubic interpolation could work even better, perhaps pushing that to 4500 to 1, but then the run-time calculations take longer.
Obviously:
Linear interpolation is faster (x10).
Linear interpolation occupies more memory (x10).
Cubic interpolation is slower (by a factor of about 10 if not precomputed).
Cubic interpolation occupies less memory (by a factor of about 10).
A good linear table requiring 100 points can be solved with 10 points and cubic interpolation, probably with better accuracy.
It is quite fast on a current micro. If the cubic interpolation coefficients are precomputed, we are talking about doing 4 multiplications and 4 additions for each interpolation.
-
The CRITICAL fact of importance here is simply "how accurate do you need to be?"
People often go off down rabbit holes and spend ages coming up with complex solutions to problems that simply don't require it!
For example, the solution to say a temperature sensor NTC calibration characteristic for a greenhouse temperature sensor (+- 1degC is more than good enough) is very very different to the solution required for say a complex organic chemistry thermostat, where +-0.05degC might be required.
Knowing where you are going is critical to picking the correct route!
Chances are, unless you are doing something very clever and complex, a basic linear interpolation from a Look-Up Table with carefully chosen index nodes is all that will be required (more nodes where there is a larger "acceleration" of the signal, fewer where it is more constant).
-
The required accuracy by itself is not decisive. The NTC calibration can be solved with 100 points and linear interpolation, and the other thermostat with 1000 points and linear interpolation.
You could also solve the NTC calibration with 10 points and cubic interpolation and the thermostat calibration with 100 points and cubic interpolation.
The method of carefully chosen index nodes is valid for any interpolation, both linear and cubic.
With temperature (and many other measurements) speed is not important considering how fast any micro calculates today.
One thing is certain: if you are not good at math, the best thing to do is make a LUT. If you're up to the basics, linear interpolation.
If you know first-year college math, you can handle cubic interpolation and beyond. If you are an expert in mathematics, you will be able to deal with rational interpolation, the most difficult but also the most capable of approximating all types of curves.
You also have to take into account the time you will spend on understanding and applying methods that are becoming more and more complex.
-
How to calibrate an RTD (platinum resistance thermometer), a very accurate thermometer, to the ITS-90 standard:
Fluke: https://s3.amazonaws.com/download.flukecal.com/pub/literature/3498460B_EN_w.pdf (https://s3.amazonaws.com/download.flukecal.com/pub/literature/3498460B_EN_w.pdf)
Bureau International des Poids et Mesures: https://www.bipm.org/documents/20126/41773843/Guide_ITS-90_5_SPRT_2021.pdf (https://www.bipm.org/documents/20126/41773843/Guide_ITS-90_5_SPRT_2021.pdf)
-
Cubic interpolation is slower [than linear interpolation] (by a factor of about 10 if not precomputed).
No, those figures are not representative, even if you consider 10 as "an approximate order of magnitude".
Let's look at the precomputed implementations first.
Numerically stable linear interpolation uses \$x_i\$, \$x_{i+1}\$, \$A_i\$, and \$B_i\$, so that
$$y = (x_{i+1} - x) A_i + (x - x_i) B_i$$
which is exactly two multiplications and three additions or subtractions. The precomputed coefficients are
$$A_i = \frac{y_i}{x_{i+1} - x_i}, \quad B_i = \frac{y_{i+1}}{x_{i+1} - x_i}$$
Note that there are one less pair of coefficients than there are calibration points.
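In code, evaluating one precomputed linear segment is then just (a sketch; the array names xs[], A[], B[] and the index i from the binary search are placeholders of mine):
// y = (xs[i+1] - x) * A[i] + (x - xs[i]) * B[i]:
// two multiplications, three additions/subtractions.
static inline float linear_segment(const float *xs, const float *A,
                                   const float *B, unsigned i, float x)
{
    return (xs[i + 1] - x) * A[i] + (x - xs[i]) * B[i];
}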
Numerically stable cubic interpolation uses \$x_i\$, \$x_{i+1}\$, \$A_i\$, \$B_i\$, \$C_i\$, and \$D_i\$, so that
$$\begin{aligned}
y &= (x_{i+1} - x)^3 A_i + (x_{i+1} - x)^2 (x - x_i) B_i + (x_{i+1} - x)(x - x_i)^2 C_i + (x - x_i)^3 D_i \\
~ &= (x_{i+1} - x)^2 \bigl( (x_{i+1} - x) A_i + (x - x_i) B_i \bigr) + (x - x_i)^2 \bigl( (x_{i+1} - x) C_i + (x - x_i) D_i \bigr) \\
\end{aligned}$$
which is exactly eight multiplications and five additions or subtractions. There are three basic ways to precompute the coefficients; using four consecutive points \$(x, y)\$, or using the derivative at each point, or using an estimate of the derivative at each point derived from the previous and next points relative to the current one. The last two I've shown earlier in this thread.
So, with precomputed coefficients, the arithmetic in the cubic interpolation costs something like 4× that of the linear interpolation, and both share the same overhead for locating the correct segment (a binary search, log2(segments) iterations), so the overall difference is much less than a factor of four. It is also ridiculously fast, involving so few arithmetic operations that even the function call overhead is a significant fraction of the overall time taken per call on many architectures. (The exception is on 8-bit architectures where you use software-emulated floating-point. Fixed point is much better suited for this there.)
Not precomputing the coefficients does not save you much. In addition to the values of \$x\$, you need either one coefficient per point (linear interpolation without precomputing; cubic interpolation without precomputing, using four consecutive points for each cubic polynomial), two (precomputed linear interpolation; cubic interpolation without precomputing, using the value and tangent at each point), or four (precomputed cubic interpolation).
Linear interpolation without precomputing is
$$y = \frac{(x_{i + 1} - x) y_i + (x - x_i) y_{i + 1}}{x_{i+1} - x_i}$$
which uses two multiplications, one division, and four additions or subtractions. The time needed for the division dominates the time taken by this computation, except on those rare-ish architectures with division as fast as multiplication. On most architectures, it takes roughly the same time as cubic interpolation with precomputed coefficients, because the division is just so darn slow compared to multiplications and additions and subtractions.
Cubic interpolation using four consecutive points has a few dozen terms, and I've never optimized it, because it just isn't very useful (in the numerically stable form, which is what I want to use). I can compute the exact formula, if someone wants it, though.
Cubic interpolation using the derivative (known or estimated) at each point, so that we have \$x_i\$, \$y_i\$, and the derivative or slope \$\gamma_i\$, at each point. Then, the interpolation is
$$y = \frac{ (x_{i+1} - x)^2 \biggl( (x_{i+1} - x) y_i + (x - x_i) \bigl( 3 y_i + \gamma_i (x_{i+1} - x_i ) \bigr) \biggr)
+ (x - x_i)^2 \biggl( (x - x_i) y_{i+1} + (x_{i+1} - x) \bigl( 3 y_{i+1} - \gamma_{i+1} (x_{i+1} - x_i ) \bigr) \biggr)
}{(x_{i+1} - x_i)^3}$$
which can be done using 11 multiplications, one division, and 12 additions or subtractions, if you use some temporary variables, in the numerically stable form. Even here, the division is a significant part of the time cost. Still, it is no more than five times the cost of the linear interpolation.
Because importance sampling makes a huge difference in the accuracy of the piecewise approximation, all these variants share exactly the same cost of locating the correct segment in \$x\$ (a binary search rather than a simple index calculation), and that does take significant time compared to the actual arithmetic done to interpolate within the found segment. One would expect something like a factor of 2 to 3 difference in practice between linear and cubic interpolation without precomputed coefficients; and that is very little when considering the much better fit one can get with piecewise cubic curves. On architectures where a division takes several times as long as a multiplication, cubic interpolation with precomputed coefficients can even be faster than linear interpolation without precomputed coefficients.
Essentially, the precomputed coefficients can eliminate the division, half of the additions and subtractions, and a third of the multiplications: it trades increased memory use for fewer arithmetic operations.
Exact speed differences depend on the relative speed of the division compared to multiplication, and how large a fraction of the overall time does the binary search take to find the correct segment.
To repeat: you do not want to require \$x\$ to be regularly sampled, because importance sampling, i.e. picking more points where the behaviour changes or is more complex, always yields better results than regular sampling, in both memory use and runtime.
Note that the formulas used to precompute the coefficients are not so complicated that one could not evaluate them on a microcontroller. That is, you could easily reserve a portion of Flash for the coefficient table, and let the user set individual calibration points. Just keep them in sorted order according to \$x\$. When the user is satisfied, you could compute the coefficients and update the Flash to reflect the new calibrated coefficients. There is no need to do the precalculation offline, because the coefficient calculations are done only once per calibration point, and even when using software-emulated double-precision floating-point on an 8-bit microcontroller, they don't take enough time for a human to notice until there are hundreds of calibration points.
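For the precomputed linear coefficients, for example, the recomputation after the user edits the calibration points is a single pass over the sorted table (a sketch; the array names are placeholders, and the actual Flash write is device-specific and omitted):
// Recompute A[i] = y[i] / (x[i+1] - x[i]) and B[i] = y[i+1] / (x[i+1] - x[i])
// for the n-1 segments defined by n calibration points sorted by x.
void recompute_linear_coefficients(const float *x, const float *y, int n,
                                   float *A, float *B)
{
    for (int i = 0; i < n - 1; i++) {
        const float dx = x[i + 1] - x[i];
        A[i] = y[i]     / dx;
        B[i] = y[i + 1] / dx;
    }
}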
-
The factor I wrote (about 10) was to compare the linear interpolation with the cubic interpolation without precalculating the coefficients (Neville's algorithm) and it was an approximation, not an exact figure. My argument was trying to point out the slower speed of cubic interpolation, with the advantage of memory reduction.
Thanks for your clarification.
---------
On the other hand, normally we are not going to have a whole table of calibration values available. In many cases we will have 3, 4 or 5 calibration points at most. In that case it is best to use an interpolating polynomial of maximum degree and to choose the calibration points well.
-
A very interesting application note from TDK for calibrating NTC thermistors for temperature measurement:
https://www.tdk-electronics.tdk.com/download/2999750/74d6ef4ba23ccb21f85f2df587a15eff/ntc-thermistors-an.pdf (https://www.tdk-electronics.tdk.com/download/2999750/74d6ef4ba23ccb21f85f2df587a15eff/ntc-thermistors-an.pdf)
It compares interpolation with 2 calibration points (blue), Steinhart-Hart equation with 3 calibration points (orange) and Lookup Table method (green).
-
In many cases we are going to have 3, 4 or 5 calibration points at most. In that case it is best to use an interpolating polynomial of maximum degree and choose well the calibration points.
Yes. In some cases the polynomial itself is known or specified, for example the 12th degree polynomial for RTDs at -259.35°C to 0°C and the 9th degree polynomial at 0°C to 962°C as specified in ITS-90, and the problem is to adjust the coefficients so that at the calibrated input values, the known result is obtained. The math problem then is that there are not enough calibration points to fix the coefficients.
In practice, using an RTD like a Pt100 or Pt1000 sensor, the software problem is to convert an ADC reading to a temperature, i.e. we have a function T(x) where x is the ADC reading. The function itself is a 12th degree polynomial, so we'd need 13 calibration points to fix all the coefficients. Partially "adjusting" the coefficients from only a smaller number of calibration points is mathematically extremely complicated, and not worth trying.
Instead, a two-point calibration is often used, which assumes the shape of the polynomial is the one as specified in ITS-90, and only scale and offset need adjustment. In other words, we assume that there are no nonlinearities in the resistance-to-digital measurement, with T(x) describing the correct behaviour, but requiring a linear correction: offset and scale.
If we have defined the temperature as a function of ADC reading as (piecewise)
$$T(x) = \sum_{i=0}^{12} C_i x^i = C_0 + C_1 x + C_2 x^2 + C_3 x^3 + \dots + C_{11} x^{11} + C_{12} x^{12}$$
then the two-point calibration evaluates \$T(A_0 + A_1 x)\$, where \$x\$ is the raw ADC reading and \$A_0\$, \$A_1\$ are the offset and scale correction.
If the two calibration temperatures are \$T_0\$ and \$T_1\$, measured at raw ADC readings \$x_0\$ and \$x_1\$, and the corresponding ideal readings are \$X_0\$ and \$X_1\$ with \$T(X_0) = T_0\$ and \$T(X_1) = T_1\$ (which you can find with a binary search in \$x\$, because the function \$T\$ is monotonic even though it is a high-degree polynomial), then
$$\left\lbrace \begin{aligned}
A_0 + A_1 x_0 &= X_0 \\
A_0 + A_1 x_1 &= X_1 \\
\end{aligned} \right . \quad \iff \quad \left\lbrace \begin{aligned}
A_0 &= \frac{X_0 x_1 - X_1 x_0}{x_1 - x_0} \\
A_1 &= \frac{X_1 - X_0}{x_1 - x_0} \\
\end{aligned} \right .$$
with the actual function \$T\$ always being unchanged, and the linear correction applied to the ADC reading before evaluating \$T\$.
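As a minimal sketch of this two-point correction (the names are placeholders of mine: ideal_x() must return the ideal reading for a given temperature, i.e. invert the model T(), for example with the binary search described further below):
// Two-point calibration sketch: x0 and x1 are the raw ADC readings measured
// at the known temperatures T0 and T1.  Results go into *A0 and *A1.
void two_point_calibrate(double x0, double T0, double x1, double T1,
                         double (*ideal_x)(double),
                         double *A0, double *A1)
{
    const double X0 = ideal_x(T0);  // ideal reading corresponding to T0
    const double X1 = ideal_x(T1);  // ideal reading corresponding to T1
    *A1 = (X1 - X0) / (x1 - x0);
    *A0 = X0 - (*A1) * x0;
}
// Afterwards, for a raw ADC reading x, the temperature is T(*A0 + *A1 * x).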
If you wanted to apply the linear correction directly to the polynomial coefficients, I can show how that is done, but I recommend against it. \$T(x)\$ tends not to be particularly numerically stable –– it is okay because it is mostly linear, so the higher-degree coefficients are small adjustments to the linear value –– and applying the linear correction directly to the polynomial coefficients combines them in an even less numerically stable way. Applying the linear correction to \$x\$ as above, on the other hand, yields acceptable results. That is, for ADC reading \$x\$, the corresponding temperature is \$T(A_0 + A_1 x)\$.
Interestingly, this works the same even when T(x) was defined using piecewise cubic or piecewise linear segments, which can be done efficiently using integer or fixed-point arithmetic; useful when floating-point emulation is deemed too slow.
Technically speaking, if you have the theoretical behaviour modeled as a polynomial or a piecewise linear or piecewise cubic or any other type of function, you can apply the linear two-point calibration correction the same way. You only need to find a way to obtain \$x\$ for each calibration point corresponding to the user-specified \$T\$.
For piecewise-defined monotonic functions, the same binary search can be used to find the segment containing the value \$T\$ (aka \$y = f(x)\$). If the points at the end of the segment are \$x_i, y_i\$ and \$x_{i+1}, y_{i+1}\$, \$y = f(x)\$ solved for x is
$$x = \frac{(x_{i+1} - x_i) y + x_i y_{i+1} - x_{i+1} y_i}{y_{i+1} - y_i}$$
For the cubic interpolation, considering the complexity of the closed-form solution, I recommend instead doing a binary search between \$x_i\$ and \$x_{i+1}\$ to the desired resolution (each iteration adds one bit of precision to \$x\$), since this is only needed during calibration and only needs to be sufficiently fast.
In the general case, for monotonic functions, you can solve for \$x\$ using a binary search on calibrated(x). It won't be the fastest possible option, but it will be fast enough for calibration, even at say 64 bits of precision.
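A minimal sketch of that general inversion, treating calibrated() as a monotonically increasing double-to-double function for simplicity, with x_lo and x_hi bracketing the solution:
// Solve calibrated(x) = target for x by bisection; each iteration adds
// roughly one bit of precision to x.
double invert_monotonic(double target, double x_lo, double x_hi,
                        double (*calibrated)(double))
{
    for (int i = 0; i < 64; i++) {
        const double x_mid = 0.5 * (x_lo + x_hi);
        if (calibrated(x_mid) < target)
            x_lo = x_mid;
        else
            x_hi = x_mid;
    }
    return 0.5 * (x_lo + x_hi);
}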
-
It depends. What function do you want to approximate and with what accuracy?
-
Hello again,
Just to note, there are more numerical interpolation methods than there are stars in the sky :phew: