Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.


Curve Fitting by Polynomials

A polynomial fit is an extension of simple linear regression. Instead of fitting a straight line, a polynomial curve is adjusted so that the sum of squares of the residuals ε_j becomes a minimum.

$$y_j = f(x_j) = a_0 + a_1 x_j + a_2 x_j^2 + \dots + a_n x_j^n + \varepsilon_j = \sum_{k=0}^{n} a_k x_j^k + \varepsilon_j$$
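As a minimal sketch of how the coefficients a_k might be obtained, the following Python code minimizes the sum of squared residuals by setting up and solving the normal equations. The function names (fit_polynomial, evaluate, sum_of_squares) and the sample data are illustrative assumptions and do not come from the original text. In practice one would normally use a library routine, since the normal equations become ill-conditioned for high polynomial orders.

```python
def fit_polynomial(xs, ys, order):
    """Return least-squares coefficients [a_0, ..., a_n] for the given order.

    Assumes at least order + 1 distinct x values.
    """
    m = order + 1  # number of coefficients
    if len(xs) < m:
        raise ValueError("need at least order + 1 data points")
    # Normal equations (X^T X) a = X^T y, where X[j][k] = x_j ** k.
    A = [[sum(x ** (r + c) for x in xs) for c in range(m)] for r in range(m)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(m)]
    # Solve by Gaussian elimination with partial pivoting.
    for i in range(m):
        p = max(range(i, m), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, m):
            factor = A[r][i] / A[i][i]
            for c in range(i, m):
                A[r][c] -= factor * A[i][c]
            b[r] -= factor * b[i]
    # Back substitution.
    a = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(A[i][c] * a[c] for c in range(i + 1, m))
        a[i] = (b[i] - tail) / A[i][i]
    return a


def evaluate(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n using Horner's scheme."""
    result = 0.0
    for a_k in reversed(coeffs):
        result = result * x + a_k
    return result


def sum_of_squares(coeffs, xs, ys):
    """Sum of squared residuals between the data and the fitted polynomial."""
    return sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))


# Made-up, noise-free example data generated from y = 1 + 2x + 0.5x^2
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 2.0 * x + 0.5 * x ** 2 for x in xs]

coeffs = fit_polynomial(xs, ys, order=2)
print("coefficients a_0..a_2:", coeffs)
print("sum of squared residuals:", sum_of_squares(coeffs, xs, ys))
```

Because the sample data contain no noise, the fitted coefficients should reproduce the generating values 1, 2 and 0.5 up to rounding error, and the sum of squared residuals should be practically zero.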

The highest power of the polynomial equation determines the order of the polynomial. The higher the order of the polynomial, the more closely the curve can follow the data (which does not necessarily mean that polynomials of higher order deliver better models). A polynomial of order n has n + 1 coefficients, so the number of data points has to be at least one greater than the order of the polynomial. However, it is a good idea to use substantially more data points than this minimum requirement. As a rule of thumb, the number of data points should be at least twice the order of the polynomial. If the number of measured data points is too low (or the order of the polynomial is too high), the fitted function generalises poorly, resulting in unreliable estimated values.
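The effect of using only the minimum number of data points can be illustrated with a small Python sketch. With exactly order + 1 points at distinct x values, the least-squares polynomial passes through every point, so all residuals are zero, yet it may predict new values badly. The data below are made up for illustration (a straight line with alternating errors of +/-0.3), and the function names interpolate and fit_line are assumptions, not part of the original text.

```python
def interpolate(xs, ys, x):
    """Evaluate the unique polynomial of order len(xs) - 1 that passes
    through all points (Lagrange form). Assumes distinct x values."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        weight = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                weight *= (x - xk) / (xj - xk)
        total += yj * weight
    return total


def fit_line(xs, ys):
    """Closed-form least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


# Made-up data: the line y = 1 + 2x with alternating errors of +/-0.3
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.3, 2.7, 5.3, 6.7]

x_new = 4.0
true_value = 1.0 + 2.0 * x_new

# Order 1 with four points: more data than the minimum of two
intercept, slope = fit_line(xs, ys)
line_prediction = intercept + slope * x_new

# Order 3 with four points: exactly the minimum, so the curve hits every point
cubic_prediction = interpolate(xs, ys, x_new)

print(f"true value:         {true_value:.2f}")
print(f"order 1 prediction: {line_prediction:.2f}")
print(f"order 3 prediction: {cubic_prediction:.2f}")
```

For these data the straight line predicts about 8.7 at x = 4, close to the underlying value of 9, whereas the order 3 polynomial predicts about 4.5 even though it reproduces all four data points exactly.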

In principle, the selection of the order of the polynomial should be governed by the underlying physical principles of the relationship to be modeled. For example, if we set up a model relating the recorded mass in a mass spectrometer to the measured voltage of the Hall sensor (a relationship which is quadratic in nature), we should parameterize a parabolic function (a polynomial of order 2) and not any other polynomial relationship.