ENH: Solving two-step interpolation in scipy

Is your feature request related to a problem? Please describe.

This is cross-posted from Stack Overflow.

This problem is a little complicated, so I will try to make it as simple as possible.

Say that I have a table of values on which I want to perform two-step 1d interpolation; that is, interpolating over two independent variables by applying 1d interpolation twice. Note that this is a different problem from the one solved by scipy.interpolate.interp2d, and the appropriate algorithm is quite different.

Here is an example of source data for performing the interpolation. The column labels are the dependent variable, Z. The row labels are the SECOND independent variable, Y. The 2d, interior portion of the table is the FIRST independent variable, X.

   Z  1  2  3
Y
1     1  2  3   <-- X row 1
2     3  4  5   <-- X row 2
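
As a concrete starting point, the table above can be written as plain Python sequences. The names X, Y, and Z are chosen to match the calls shown later in this issue. The `check_table` helper is a hypothetical addition that only states the shape assumption (one row of X per Y value, one column per Z value); it is not part of scipy.

```python
# Source table as array_like sequences.
Z = [1, 2, 3]          # column labels: dependent variable
Y = [1, 2]             # row labels: second independent variable
X = [[1, 2, 3],        # X row 1 (y = 1)
     [3, 4, 5]]        # X row 2 (y = 2)


def check_table(x, y, z):
    """Return True if x has one row per y value and one column per z value."""
    return len(x) == len(y) and all(len(row) == len(z) for row in x)


print(check_table(X, Y, Z))
```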

We want to produce a factory function. The factory produces another function which, given the above source data, performs appropriate twice-interpolation of any x,y pair.

Describe the solution you’d like.

Here I will call the factory function twice_interp1d_with_2d_x, and it will look like this:

def twice_interp1d_with_2d_x(x, y, z):
    """Produce an interpolating function of the form:
           f(x,y)
       The function interpolates an x and y value to a z value.
       Expects a 2d x argument. The y-axis is expected to be the rows; z is the columns."""
    ...

Here is the same function with type information attached (all variables are array_like; x is expected to be 2d, y and z are 1d):

def twice_interp1d_with_2d_x(x: array_like, y: array_like, z: array_like) -> Callable[[array_like, array_like], array_like]: ...

Describe alternatives you’ve considered.

Using the existing scipy.interpolate.interp1d function (which includes additional bounds_error and fill_value arguments), I have already written this function (but it is broken; see below), and the basic solution works like this:

from scipy.interpolate import interp1d

def twice_interp1d_with_2d_x(x, y, z, bounds_error=True, fill_value=None):
    # Step 1: build one x -> z interpolant per row of x (one row per y value).
    interp1d_rows = [interp1d(row, z, bounds_error=bounds_error, fill_value=fill_value) for row in x]
    def interpolator(x_inner, y_inner):
        # Step 2: evaluate every row at x_inner, then interpolate those z values over y.
        temp_z = [f(x_inner) for f in interp1d_rows]
        return interp1d(y, temp_z, bounds_error=bounds_error, fill_value=fill_value)(y_inner)
    return interpolator

This works for many situations:

>>> twice_interp1d_with_2d_x(X, Y, Z)(3, 2)
array(1.)

Side note: interp1d always produces numpy arrays; this result is a 0d (i.e., scalar) array.

Additional context (e.g. screenshots, GIFs)

But there is a big problem. Consider the case of (x=4, y=2). The solution to this case should be:

>>> twice_interp1d_with_2d_x(X, Y, Z)(4, 2)
array(2.)

But instead, we get a ValueError because 4 is beyond the upper bound of the first row in X. We can try to fix this by passing bounds_error=False, but then the nan returned for the first row carries through the second interpolation, and we just get a nan:

>>> twice_interp1d_with_2d_x(X, Y, Z, bounds_error=False)(4, 2)
array(nan)

To make matters even more complicated, sometimes an array will be passed as the first argument or second argument, or BOTH arguments, and we might have to deal with np.nan for some combinations, but not others:

>>> twice_interp1d_with_2d_x(X, Y, Z, bounds_error=False)([3, 4], 2)
array([nan,  2.])

The question is: is there a simple, concise way to produce a correct version of this function using modern idiomatic Python and numpy that I am overlooking? I know I could solve this with line after line of logic, but that would take hours of work. Is this solvable in some quick, elegant way that I am not thinking of?

If not, it seems to me something like this would be a good candidate to be added to the scipy.interpolate module. Note that I have limited the problem description to the case of a table with a 2d x value; however, it could just as easily be a 2d y or 2d z value (but the case of the z value is simple: no boundary error problem exists).

Top GitHub Comments

ev-br commented, Aug 9, 2022

OK, at this stage ISTM that the original query has been responded to (even if not in a fully complete way), a possibly full solution can be cooked up using tools scipy.interpolate provides already, and there does not seem to be much to track for a possible enhancement. Closing, but do feel free to keep discussing.

Patol75 commented, Sep 23, 2021

It seems to me that what you are trying to achieve, at least in the way that you described it, is similar to an inverse problem for interp2d. Using interp2d, the logical approach would be to find an interpolated value x within X given y and z coordinates that belong, respectively, to Y and Z. However, here, it looks like you already have x, and you also know at which y it was sampled. So you are asking the question: given y, at which value z of Z should I interpolate X to obtain x? At least, that is the way I see it from the summary you gave. If this is correct, I am tempted to say this is a fair question, but I am not sure if it is in the current scope of SciPy. One limitation, though, is that it seems you can have repeated values in Z, which is problematic.
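
To make the "inverse" reading concrete for a single row of the table: the forward lookup maps z to x along a row, while the question in this issue maps x back to z, which amounts to swapping the two arrays passed to a 1d interpolator. The sketch below uses a small hand-written helper (`lerp`, a hypothetical name) rather than scipy so that each step is visible. It assumes the sample points are strictly increasing, and the row values are taken from the example table in the issue.

```python
from bisect import bisect_left


def lerp(x, xs, ys):
    """Linearly interpolate ys over strictly increasing xs at the point x."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the interpolation range")
    # Index of the upper bracketing sample, clipped so that lo = hi - 1 is valid.
    hi = min(max(bisect_left(xs, x), 1), len(xs) - 1)
    lo = hi - 1
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])


Z = [1, 2, 3]
x_row = [3, 4, 5]                    # X values sampled at y = 2

forward = lerp(2.5, Z, x_row)        # forward: given z, find x along the row
inverse = lerp(forward, x_row, Z)    # inverse: given x, recover z
print(forward, inverse)
```

Whichever array plays the role of the sample points must be strictly increasing for the bracketing step to be well defined, which is one way to see why repeated values are a problem.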

Regarding your issue with nan values, I have suggested a solution on StackOverflow, which I am cross-posting here to illustrate that it seems to work for the test case you posted above.

import numpy as np
import pytest
from scipy.interpolate import interp1d


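def fun(X, Y, Z, x, y, **kwargs):
    """Interpolate z at a scalar (x, y); kwargs are forwarded to interp1d."""
    X, Y, Z = np.asarray(X), np.asarray(Y), np.asarray(Z)
    if y in Y:
        # y matches a row label exactly: interpolate along that row only, so
        # out-of-range values in the other rows cannot introduce nan.
        return interp1d(X[np.nonzero(Y == y)[0][0]], Z, **kwargs)(x)
    else:
        # Otherwise evaluate every row at x, then interpolate the results over Y.
        return interp1d(Y, [interp1d(row, Z, **kwargs)(x) for row in X],
                        **kwargs)(y)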


@pytest.mark.parametrize("x, y, expected", [
    (0, 1, 1), (5, 1, 1), (70, 1, 0), (90, 1, 0), (0, 1.1, 1), (10, 1.1, 1),
    (70, 1.1, 0), (90, 1.1, 0), (0, 1.2, 1), (15, 1.2, 1), (70, 1.2, 0),
    (90, 1.2, 0), (1, 1, 1), (37.5, 1, 0.5), (80, 1, 0), (1, 1.1, 1),
    (40, 1.1, 0.5), (80, 1.1, 0), (1, 1.2, 1), (42.5, 1.2, 0.5), (80, 1.2, 0)])
def test_interp2d_FIG7P4D1_Cs(x, y, expected):
    x_arr = [[0, 5, 70, 90], [0, 10, 70, 90], [0, 15, 70, 90]]
    y_arr = [1, 1.1, 1.2]
    z_arr = [1, 1, 0, 0]
    result = fun(x_arr, y_arr, z_arr, x, y)
    np.testing.assert_array_equal(result, expected)
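
The exact-match branch in `fun` can be taken one step further. For any y, only the two rows of X that bracket it are needed, and a row whose weight is zero can be skipped, so an out-of-range x in that row never produces a nan. What follows is a scalar-only sketch of that idea using just the standard library. It assumes that Y and every row of X are strictly increasing and that the table has at least two rows. `lerp_or_nan` and `twice_interp_sketch` are hypothetical names; this is not code from scipy or from the thread.

```python
import math
from bisect import bisect_left


def bracket(value, axis):
    """Return (lo, hi) indices of the samples in axis that bracket value."""
    hi = min(max(bisect_left(axis, value), 1), len(axis) - 1)
    return hi - 1, hi


def lerp_or_nan(x, xs, ys):
    """Linear interpolation over strictly increasing xs; nan outside the range."""
    if not xs[0] <= x <= xs[-1]:
        return math.nan
    lo, hi = bracket(x, xs)
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])


def twice_interp_sketch(x_table, y_axis, z_axis):
    """Build f(x, y) -> z from a 2d x table, 1d y row labels and 1d z column labels."""
    def interpolator(x, y):
        if not y_axis[0] <= y <= y_axis[-1]:
            return math.nan
        lo, hi = bracket(y, y_axis)
        # If y sits exactly on a row, use that row alone and ignore its neighbour.
        if y == y_axis[lo]:
            return lerp_or_nan(x, x_table[lo], z_axis)
        if y == y_axis[hi]:
            return lerp_or_nan(x, x_table[hi], z_axis)
        # Otherwise blend the two bracketing rows; nan in either row propagates.
        z_lo = lerp_or_nan(x, x_table[lo], z_axis)
        z_hi = lerp_or_nan(x, x_table[hi], z_axis)
        t = (y - y_axis[lo]) / (y_axis[hi] - y_axis[lo])
        return z_lo + t * (z_hi - z_lo)
    return interpolator


f = twice_interp_sketch([[1, 2, 3], [3, 4, 5]], [1, 2], [1, 2, 3])
print(f(3, 2), f(4, 2), f(3, 1.5), f(4, 1.5))
```

With the example table from the issue, `f(4, 2)` no longer depends on the first row. A point such as `f(4, 1.5)` still yields nan, because a row that is genuinely needed does not cover x = 4. Array inputs are left out of this sketch.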