Comparison with xarray's built-in interp()

Recently realized that xarray has a nice new interpolation module added by @crusaderky and @fujiisoup in pydata/xarray#2079 (also cc @shoyer, @rabernat).

It does have some overlap with xESMF (fortunately not too much), so I think it would be necessary to:

  • Compare their advantages & limitations
  • Identify their proper use cases
  • Better define the development roadmap and avoid duplication of effort

interp() wraps scipy.interpolate, while xESMF wraps ESMPy’s regridding functionality. The subtle difference between “interpolation” and “regridding” is that the former often refers to traditional 1D interpolation (sometimes N-D), mostly in Cartesian space, while the latter specifically refers to geospatial data on Earth’s sphere.

I personally think interp() is a great fit for:

  • Interpolation over a 1D coordinate (e.g. vertical layers, time). I’ve been using Scipy for vertical interpolation, too. ESMPy does support 3D grids, but that is generally overkill.
  • Data that are not on Earth’s sphere, such as the output from other physical models. ESMPy can actually handle Cartesian coordinates, but everyone seems to use only spherical coordinates.
  • High-dimensional regular grids (>=4D? rarely seen in Earth science but can occur in other physical sciences or machine learning). It seems that interp’s API tries to generalize to arbitrary dimensions, while xESMF is specific to horizontal regridding on the sphere.
  • Sampling over a trajectory via “Advanced Interpolation”. ESMPy has similar support via LocStream, but I personally don’t use it. Glad that xarray has native support for this feature.
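
To make the first and last points concrete, here is a minimal sketch of 1D vertical interpolation and trajectory sampling with interp(). The field, the coordinate names (z, lat, lon), and the track points are made up for illustration; the field is linear in every coordinate so that linear interpolation reproduces it exactly. It assumes xarray and scipy are installed.

```python
import numpy as np
import xarray as xr

# Synthetic field on a regular (z, lat, lon) grid. All names and values here
# are illustrative assumptions, not taken from any real dataset.
z = np.array([0.0, 1000.0, 5000.0, 10000.0])
lat = np.linspace(-10.0, 10.0, 5)
lon = np.linspace(0.0, 40.0, 9)
values = (
    0.001 * z[:, None, None]
    + lat[None, :, None]
    + 2.0 * lon[None, None, :]
)
da = xr.DataArray(
    values,
    coords={"z": z, "lat": lat, "lon": lon},
    dims=("z", "lat", "lon"),
)

# 1D interpolation along the vertical coordinate only; lat and lon are untouched.
on_new_levels = da.interp(z=[500.0, 7500.0])

# Trajectory sampling ("advanced interpolation"): the target lat/lon share a
# new "points" dimension, so the result holds one value per track point
# instead of the full lat x lon outer product.
track_lat = xr.DataArray([-7.5, 0.0, 3.0], dims="points")
track_lon = xr.DataArray([2.5, 11.0, 37.0], dims="points")
along_track = da.interp(lat=track_lat, lon=track_lon)
```

Passing plain lists for lat and lon instead of DataArrays sharing a dimension would interpolate onto the full 3 x 3 grid of target points rather than along the track.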

For geospatial regridding tasks, xESMF has some important strengths. Many are already covered in the docs, but more specifically:

  • Performance, especially with large data. xESMF reuses weights, but scipy.interpolate does not. This simple test shows that xESMF is 16x faster than interp() once the weights are computed (computing the weights is also faster than interp). That said, this performance gap narrows on distributed cloud platforms, where I/O time dominates (pangeo-data/pangeo#334).
  • Curvilinear grid (pydata/xarray#2281). This is the major reason why I wrote xESMF…
  • Conservative algorithm, to conserve the integral for density-like fields such as air density, heat flux, emission intensity… (This algorithm is used everywhere in Earth science but is never taught in numerical analysis classes. Scipy is basically “everything in a numerical analysis textbook”, so unsurprisingly it has no conservative scheme.)
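
The weight-reuse and conservation points can be illustrated with a toy 1D Cartesian example in plain NumPy. This is only a sketch of the idea, not the spherical algorithm that ESMF implements; the function name conservative_weights and the grids are hypothetical. The weights are overlap fractions between source and destination cells, computed once and then applied to many fields with a single matrix product.

```python
import numpy as np

def conservative_weights(src_edges, dst_edges):
    """Return W of shape (n_dst, n_src), where W[i, j] is the fraction of
    destination cell i that is covered by source cell j (1D cells)."""
    src_lo, src_hi = src_edges[:-1], src_edges[1:]
    dst_lo, dst_hi = dst_edges[:-1], dst_edges[1:]
    # Overlap length between every (destination, source) pair of cells.
    overlap = (
        np.minimum(dst_hi[:, None], src_hi[None, :])
        - np.maximum(dst_lo[:, None], src_lo[None, :])
    )
    overlap = np.clip(overlap, 0.0, None)
    return overlap / (dst_hi - dst_lo)[:, None]

# Source grid: 12 cells of width 1. Destination grid: 5 cells of width 2.4,
# covering the same interval but not aligned with the source cells.
src_edges = np.linspace(0.0, 12.0, 13)
dst_edges = np.linspace(0.0, 12.0, 6)

# The expensive step is done once...
weights = conservative_weights(src_edges, dst_edges)

# ...and reused for every field, e.g. 100 time steps of a density-like variable.
rng = np.random.default_rng(0)
fields = rng.random((100, 12))
regridded = fields @ weights.T

# The integral (cell value times cell width, summed) is the same on both grids.
src_integral = fields @ np.diff(src_edges)
dst_integral = regridded @ np.diff(dst_edges)
```

Because each destination value is an overlap-weighted average of source cells, the integral is preserved whenever both grids cover the same interval, which point-wise linear interpolation does not guarantee.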

In short, interp is a general-purpose interpolation module; xESMF is a geospatial regridding package targeting Earth science needs. Their objectives look distinct enough to tell apart. Should we consider merging some efforts? Or perhaps just let them evolve independently?

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 9 (1 by maintainers)

Top GitHub Comments

1 reaction
shoyer commented, Jul 13, 2018

In short, interp is a general-purpose interpolation module; xESMF is a geospatial regridding package targeting at Earth science needs. Looks like their objectives can be distinguished

I agree!

Should we consider merging some efforts?

In the long term, we want an external interface that allows for extending xarray with custom index/grid objects as an explicit part of our data model, e.g., for geospatial indexing. This would allow caching some indexing/regridding computations and could potentially even allow third-party libraries to extend interp in xarray.

0 reactions
kthyng commented, Sep 17, 2020

@dcherian I can work on this once I understand interpolation better than I currently do. It’s on my to-do list!
