Clarifying the mask propagation in np.ma.polyfit in the documentation
I would like to add an example to the documentation of np.ma.polyfit to demonstrate that a 2D mask is collapsed to a 1D mask before the polynomial fit of a masked array is done. The current documentation is not very clear on how np.ma.polyfit deals with a 2D mask when fitting a 2D y array.
For example:
import numpy as np

A = np.ma.array([[0, 0, 0],
                 [1, 1, 1],
                 [2, 2, 2],
                 [10, 14, 99]])
x = np.arange(A.shape[0])
# only the mA[3, 2] entry is masked in mA below
mA = np.ma.masked_greater(A, 90)
print(mA)
# polynomial fit of all three columns of A
np.ma.polyfit(x, A, 1)
# Outputs: array([[  3.1,   4.3,  29.8],
#                 [ -1.4,  -2.2, -19.2]])
# polynomial fit of all three columns of mA
np.ma.polyfit(x, mA, 1)
# Outputs: array([[ 1.00000000e+00, 1.00000000e+00, 1.00000000e+00],
#                 [ 4.10073934e-17, 4.10073934e-17, 4.10073934e-17]])
# Instead of expected: array([[  3.1,   4.3, 1.00000000e+00],
#                             [ -1.4,  -2.2, 4.10073934e-17]])
Important Note: Masking one element in the last column affected the polynomial fit of all columns.
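To make the collapse explicit, here is a minimal sketch that reproduces both results with plain np.polyfit, so it does not depend on how np.ma.polyfit is implemented in any particular NumPy version. The names mask2d, row_mask, collapsed and per_column are only illustrative, and the row-wise "any" is an assumption about what the collapse amounts to, based on the output reported above.

```python
import numpy as np

A = np.array([[0, 0, 0],
              [1, 1, 1],
              [2, 2, 2],
              [10, 14, 99]], dtype=float)
x = np.arange(A.shape[0])
mA = np.ma.masked_greater(A, 90)

# Full boolean mask with the same shape as mA; only entry [3, 2] is True.
mask2d = np.ma.getmaskarray(mA)

# Assumed collapse: a row is dropped as soon as any column is masked in it.
row_mask = mask2d.any(axis=1)

# Fitting only the surviving rows gives slope 1, intercept ~0 in every column,
# which matches the np.ma.polyfit output reported in this issue.
collapsed = np.polyfit(x[~row_mask], A[~row_mask], 1)

# Fitting each column with its own mask keeps row 3 for columns 0 and 1,
# which gives the result the issue describes as "expected".
per_column = np.column_stack([
    np.polyfit(x[~mask2d[:, j]], A[~mask2d[:, j], j], 1)
    for j in range(A.shape[1])
])

print(collapsed)
print(per_column)
```

The per-column loop is the slow but unambiguous reference: each column loses only the rows that are masked in that column.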
Issue Analytics
- State:
- Created 6 years ago
- Reactions:1
- Comments:12 (4 by maintainers)
@indiajoe - yes, indeed, although I hadn’t specifically considered np.unique (which is quite suitable in this case for returning the indices, but a bit unhandy in that it almost certainly requires a copy of the mask). The only thing would be to generalize the .T to a .swapaxes(axis, 0) so one can have an arbitrary axis.
As for backwards compatibility and a possible keyword argument, perhaps best would be to ask on the mailing list. I do agree that the current behaviour is surprising, so it is more likely that people got wrong results out of it than that they rely on it to get correct ones.
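As a rough illustration of the np.unique idea with an arbitrary axis, the sketch below finds the distinct mask patterns of a masked array along a chosen fit axis. The helper name distinct_mask_patterns and its axis parameter are hypothetical, not part of NumPy, and this is only one way the suggestion could be read.

```python
import numpy as np

def distinct_mask_patterns(y, axis=0):
    """Hypothetical helper: group the 1D slices along `axis` by mask pattern.

    Returns (patterns, inverse), where patterns[:, k] is the k-th distinct
    mask along the fit axis and inverse[j] is the pattern index of slice j.
    """
    mask = np.ma.getmaskarray(y)
    # Bring the fit axis to the front (generalizing .T), then flatten the
    # remaining axes into columns. The reshape may copy the mask.
    mask = mask.swapaxes(axis, 0)
    mask = mask.reshape(mask.shape[0], -1)
    patterns, inverse = np.unique(mask, axis=1, return_inverse=True)
    # Flatten, since the shape of `inverse` has differed between NumPy releases.
    return patterns, np.ravel(inverse)

mA = np.ma.masked_greater(np.array([[0, 0, 0],
                                    [1, 1, 1],
                                    [2, 2, 2],
                                    [10, 14, 99]], dtype=float), 90)

patterns, inverse = distinct_mask_patterns(mA, axis=0)
print(patterns.shape[1])  # number of distinct column masks
print(inverse)            # pattern index for each column

# The same data stored transposed, with the fit axis given as axis=1.
patterns_t, inverse_t = distinct_mask_patterns(mA.T, axis=1)
```

Columns 0 and 1 share the all-unmasked pattern and column 2 gets its own, so only two distinct patterns need to be handled.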
The more I think about it, the more I feel np.ma.polyfit should not be combining all the masks into a single union mask. I can’t think of any application where one would want to throw away data points from the fits of independent columns just because some other columns have certain rows masked.
I understand that the compromise is in computation time, since one has to loop over all columns to apply each column’s own mask.
Wouldn’t it be better to slice out the columns that share the same mask pattern and pass them as a set to np.polyfit? That way, we only have to loop as many times as there are distinct column masks.
If others also agree with that approach, I could implement it and submit a pull request.
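A minimal sketch of that grouping approach, under some assumptions: the function name polyfit_grouped_by_mask is hypothetical, x is taken to be unmasked, y is 2D with the fit along axis 0, and options such as weights, rcond, full and cov are left out. It is not the implementation that NumPy uses.

```python
import numpy as np

def polyfit_grouped_by_mask(x, y, deg):
    """Hypothetical sketch: one np.polyfit call per distinct column mask."""
    x = np.asarray(x, dtype=float)            # assumes x carries no mask
    data = np.ma.getdata(y).astype(float)
    mask = np.ma.getmaskarray(y)
    # Distinct column mask patterns and, for each column, which pattern it has.
    patterns, inverse = np.unique(mask, axis=1, return_inverse=True)
    inverse = np.ravel(inverse)
    coeffs = np.empty((deg + 1, data.shape[1]))
    for k in range(patterns.shape[1]):
        keep = ~patterns[:, k]                # rows unmasked in this pattern
        cols = np.flatnonzero(inverse == k)   # columns sharing this pattern
        coeffs[:, cols] = np.polyfit(x[keep], data[keep][:, cols], deg)
    return coeffs

A = np.array([[0, 0, 0],
              [1, 1, 1],
              [2, 2, 2],
              [10, 14, 99]], dtype=float)
x = np.arange(A.shape[0])
mA = np.ma.masked_greater(A, 90)

result = polyfit_grouped_by_mask(x, mA, 1)
print(result)
```

For the example from the issue this makes two np.polyfit calls instead of three, and the saving grows when many columns share the same mask.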