Wrong definition of weights in numpy.polyfit
The documentation below for numpy.polyfit is incorrect/misleading regarding the definition of the optional input weights vector w:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html
In least-squares fitting one generally defines the weights vector in such a way that the fit minimizes the squared error (in Numpy notation)
chi2 = np.sum(weights*(p(x) - y)**2)
In the common situation where the 1σ errors “sigma” are known, the weights are the reciprocal of the variance
weights = 1/sigma**2
see e.g. http://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares or http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd432.htm
However the numpy.polyfit documentation defines the weight as “weights to apply to the y-coordinates”. This definition is not correct. The weights apply to (=multiply) the fit residuals, not only to the y-coordinates.
More importantly, looking at the math in the Numpy (v1.9.1) code, the definition of squared residuals actually adopted by polyfit is the following, with the optional input weights vector w inside the parentheses, contrary to standard practice
chi2 = np.sum((w*(p(x) - y))**2)
in such a way that the relation between w and the 1σ errors is
w = 1/sigma
which differs from the standard convention above and from what most users will expect.
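The relation above can be checked numerically. The following is a minimal sketch, assuming synthetic straight-line data and a made-up error model (the names sigma, coef_normal and coef_mismatch are illustrative only). It compares polyfit called with w = 1/sigma against a direct solution of the weighted normal equations with weights = 1/sigma**2.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
sigma = 0.1 + 0.05 * x                  # assumed known 1-sigma errors (illustrative)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma)

# polyfit called with w = 1/sigma
coef_polyfit = np.polyfit(x, y, 1, w=1.0 / sigma)

# Direct minimization of np.sum(weights * (p(x) - y)**2) with
# weights = 1/sigma**2, via the weighted normal equations
# (A^T W A) c = A^T W y.
weights = 1.0 / sigma**2
A = np.vander(x, 2)                     # columns x**1, x**0 (polyfit ordering)
AtW = A.T * weights                     # scales column i of A.T by weights[i]
coef_normal = np.linalg.solve(AtW @ A, AtW @ y)

# Passing 1/sigma**2 directly as w solves a different problem
coef_mismatch = np.polyfit(x, y, 1, w=weights)

print(np.allclose(coef_polyfit, coef_normal))
print(np.allclose(coef_polyfit, coef_mismatch))
```

The first comparison should agree to numerical precision, while the second generally does not whenever sigma varies across the data.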
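The confusion in the documentation likely arises from the fact that the Numpy code solves the linear problem below in the least-squares sense, where the w vector does multiply the y-coordinates (here x denotes the unknown vector of polynomial coefficients and vander the Vandermonde matrix of the sample points)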
(vander*w[:, np.newaxis]).dot(x) == y*w
Solving the above array expression in the least-squares sense is equivalent to minimizing the expression below, with w inside the parentheses
np.sum((w*(vander.dot(x) - y))**2)
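A short sketch of this equivalence, assuming arbitrary positive weights and synthetic quadratic data. The coefficient vector is named c here to avoid clashing with the sample points x, and objective is a hypothetical helper. The least-squares solution of the row-scaled system should have a vanishing gradient for the objective with w inside the parentheses.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 30)
y = 0.5 * x**2 - x + 2.0 + rng.normal(0.0, 0.1, x.size)
w = rng.uniform(0.5, 2.0, x.size)       # arbitrary positive weights (illustrative)

vander = np.vander(x, 3)                # columns x**2, x**1, x**0

# Least-squares solution of (vander * w[:, np.newaxis]).dot(c) == y * w
c, *_ = np.linalg.lstsq(vander * w[:, np.newaxis], y * w, rcond=None)

def objective(coef):
    """Squared residuals with w inside the parentheses."""
    return np.sum((w * (vander.dot(coef) - y))**2)

# Gradient of the objective at c: 2 * A^T W^2 (A c - y); zero at the minimum
grad = 2.0 * vander.T @ (w**2 * (vander @ c - y))

print(np.allclose(grad, 0.0, atol=1e-8))
print(np.allclose(c, np.polyfit(x, y, 2, w=w)))
```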
A non-optimal solution, to maintain compatibility, would be to change the documentation and clearly define the weight w by including it in the equation for the “squared error” E given in the Notes. One should also make clear that the adopted definition differs from standard practice by giving the relation between weights and errors, w = 1/σ.
Even better would be to define a new optional keyword weights, which follows standard practice and satisfies weights = 1/sigma**2. In this case, the code would simply compute w = np.sqrt(weights) from the input weights, and the rest of the code would apply unmodified.
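Until such a keyword exists, the same behaviour can be approximated in user code. The sketch below is a hypothetical wrapper (polyfit_weights is not part of Numpy) that accepts weights in the 1/sigma**2 convention and forwards w = np.sqrt(weights) to np.polyfit. The data and error model are made up for illustration.

```python
import numpy as np

def polyfit_weights(x, y, deg, weights=None, **kwargs):
    """Hypothetical wrapper: `weights` follows the 1/sigma**2 convention."""
    # polyfit expects w = 1/sigma, i.e. the square root of the standard weights
    w = None if weights is None else np.sqrt(np.asarray(weights, dtype=float))
    return np.polyfit(x, y, deg, w=w, **kwargs)

# Usage with assumed 1-sigma errors
rng = np.random.default_rng(2)
x = np.linspace(0.0, 5.0, 40)
sigma = 0.2 + 0.1 * x
y = 3.0 - 0.5 * x + rng.normal(0.0, sigma)

coef_std = polyfit_weights(x, y, 1, weights=1.0 / sigma**2)
coef_ref = np.polyfit(x, y, 1, w=1.0 / sigma)
print(np.allclose(coef_std, coef_ref))
```

Keeping the conversion in a thin wrapper leaves the existing w argument untouched, which matches the compatibility concern raised above.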
Issue Analytics
- Created 9 years ago
- Reactions:1
- Comments:17 (6 by maintainers)
Top GitHub Comments
Just ran into this problem myself. I agree 100% with the suggestions to add an alternative “sigma” or “weights” option, and deprecating “w”.
There are lots of uses for weights besides normalizing the variance, for instance, masking or robust least squares (IRLS).
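As an illustration of the masking use case, here is a minimal sketch with synthetic data and an assumed set of bad samples. Weights of 0 and 1 are unchanged by a square root, so for pure masking the two conventions coincide, and the masked fit should match a fit on the unmasked subset.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 25)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.05, x.size)
y[[3, 10, 17]] += 5.0                   # corrupt a few points (illustrative outliers)

mask = np.ones(x.size, dtype=bool)
mask[[3, 10, 17]] = False               # assumed known bad samples

# Masking through weights: 0 drops a point, 1 keeps it
coef_masked = np.polyfit(x, y, 1, w=mask.astype(float))

# Reference: fit only the unmasked subset
coef_subset = np.polyfit(x[mask], y[mask], 1)

print(np.allclose(coef_masked, coef_subset))
```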