Implement polyfit?

Fitting a line (or curve) to data along a specified axis is a long-standing need of xarray users. There are many blog posts and SO questions about how to do it.

The main use case in my domain is finding the temporal trend on a 3D variable (e.g. temperature in time, lon, lat).

Yes, you can do it with apply_ufunc, but apply_ufunc is inaccessibly complex for many users. Much of our existing API could be removed and replaced with apply_ufunc calls, but that doesn’t mean we should do it.

I am proposing we add a DataArray method called polyfit. It would work like this:

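import numpy as np
import xarray as xr

x_ = np.linspace(0, 1, 10)
y_ = np.arange(5)
a_ = np.cos(y_)

x = xr.DataArray(x_, dims=['x'], coords={'x': x_})
a = xr.DataArray(a_, dims=['y'])
f = a*x  # broadcasts to dims ('y', 'x'); the slope along x is a_
p = f.polyfit(dim='x', deg=1)  # proposed method

# equivalent numpy code: np.polyfit expects y with shape (len(x_), K)
p_ = np.polyfit(x_, f.values.transpose(), 1)
np.testing.assert_allclose(p_[0], a_)  # p_[0] holds the fitted slopes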

NumPy’s polyfit function is already vectorized in the sense that it accepts 1D x and 2D y, performing the fit independently over each column of y. To extend this to N-D, we would just need to reshape the data going in and out of the function. We do this already in other packages. For dask, we could simply require that the dimension over which the fit is calculated be contiguous (that is, held in a single chunk), and then call map_blocks.
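The reshaping step described above can be sketched in plain NumPy. This is only an illustration, not the implementation that ended up in xarray: the helper name polyfit_along_axis and the sample data are made up for the example.

```python
import numpy as np


def polyfit_along_axis(x, y, deg, axis=0):
    """Hypothetical helper: fit a polynomial along one axis of an N-D array."""
    y = np.asarray(y)
    # Move the fit axis to the front so it lines up with x.
    y_moved = np.moveaxis(y, axis, 0)
    rest_shape = y_moved.shape[1:]
    # Flatten all remaining axes into columns; np.polyfit fits each column.
    y_2d = y_moved.reshape(y_moved.shape[0], -1)
    coeffs = np.polyfit(x, y_2d, deg)  # shape (deg + 1, n_columns)
    # Restore the remaining axes behind a leading "degree" axis.
    return coeffs.reshape((deg + 1,) + rest_shape)


# Synthetic (time, lat, lon) data with a known linear trend per grid point.
time = np.arange(20.0)
slope = np.random.default_rng(0).normal(size=(3, 4))
data = slope[np.newaxis, :, :] * time[:, np.newaxis, np.newaxis] + 2.0

coeffs = polyfit_along_axis(time, data, deg=1, axis=0)
# coeffs[0] holds the slope at each grid point, coeffs[1] the intercept.
```

As with np.polyfit, the highest-degree coefficient comes first along the leading axis.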

Thoughts?

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 8
  • Comments: 25 (19 by maintainers)

Top GitHub Comments

2 reactions
shoyer commented, Sep 30, 2019

From a user perspective, I think people prefer to find stuff in one place.

From a maintainer perspective, as long as it’s somewhat domain agnostic (e.g., “physical sciences” rather than “oceanography”) and written to a reasonable level of code quality, I think it’s fine to toss it into xarray. “Already exists in NumPy/SciPy” is probably a reasonable proxy for the former.

So I say: yes, let’s toss in polyfit, along with fast fourier transforms.

If we’re concerned about clutter, we can put stuff in a dedicated namespace, e.g., xarray.wrappers.

1 reaction
aulemahal commented, Feb 3, 2020

I pushed a new PR trying to implement polyfit in xarray, #3733. It is still a work in progress, but I would like the opinions of those who participated in this thread.

Considering all options discussed in the thread, I chose an implementation that seemed to give the best performance and generality (skipping NaN values), but it duplicates a lot of code from numpy.polyfit.

Main question:

  • Should xarray’s implementation really replicate the behaviour of numpy’s?

A lot of extra code could be removed if we’d say we only want to compute and return the residuals and the coefficients. All the other variables are a few lines of code away for the user that really wants them, and they don’t need the power of xarray and dask anyway.

I’m guessing @huard @dcherian @rabernat and @shoyer might have comments.
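For reference, the sketch below shows what numpy.polyfit itself returns when called with full=True, which is the set of outputs the comment above weighs replicating. The sample data is made up for illustration; it says nothing about what the xarray PR returns.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10)
# Two columns: an exact line, and a parabola that a line cannot fit exactly.
y = np.stack([2.0 * x + 1.0, x**2], axis=1)  # shape (10, 2)

# full=True returns the least-squares diagnostics alongside the coefficients.
coeffs, residuals, rank, singular_values, rcond = np.polyfit(x, y, deg=1, full=True)

# coeffs has shape (deg + 1, n_columns); residuals holds one sum of squared
# residuals per column of y.
```

Here the residual for the first column is essentially zero, while the second is clearly positive because a straight line cannot reproduce x**2.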
