pandas interpolate inconsistent results with axis and method ffill

Problem: I am trying to interpolate an m x n DataFrame that contains NaN values. The default axis is 0 (index), so interpolating along axis 0 should fill a missing value (say at index n) in each column using the values at index n-1 and n+1 (or n+p, where n+p is the closest index with a valid value). This holds for the default linear method but not for the ffill method.

Code:

import numpy as np
import pandas as pd

data = np.array([[1, 2, 3, 4, np.nan, 5], [2, 4, 6, np.nan, 8, 10], [3, 6, 9, np.nan, np.nan, 30]]).T
d = pd.DataFrame(data, columns=['A', 'B', 'C'])
d

     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   NaN   NaN
4  NaN   8.0   NaN
5  5.0  10.0  30.0

d.interpolate(method='ffill', axis=1)

     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   6.0   9.0
4  4.0   8.0   9.0
5  5.0  10.0  30.0

d.interpolate(method='ffill')

     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   4.0   4.0
4  NaN   8.0   8.0
5  5.0  10.0  30.0

d.interpolate()

     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   7.0  16.0
4  4.5   8.0  23.0
5  5.0  10.0  30.0
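
The linear case can be checked by hand. The sketch below rebuilds the same frame and calls `interpolate()` with its defaults (linear method, axis 0); the name `linear` is only illustrative. Each gap is filled on a straight line between the nearest valid values above and below it in the same column.

```python
import numpy as np
import pandas as pd

data = np.array([
    [1, 2, 3, 4, np.nan, 5],
    [2, 4, 6, np.nan, 8, 10],
    [3, 6, 9, np.nan, np.nan, 30],
]).T
d = pd.DataFrame(data, columns=["A", "B", "C"])

# Default call: method="linear", axis=0 (work down each column).
linear = d.interpolate()

# Column C has two consecutive gaps between 9 and 30,
# so each step adds (30 - 9) / 3 = 7, giving 16 and 23.
print(linear)
```

This reproduces the last table above: 7.0 in column B, 4.5 in column A, and 16.0 and 23.0 in column C.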

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
CloseChoice commented, May 3, 2020

By way of explanation: ffill takes the last valid entry along the given axis and uses it to fill the NaNs that follow.

   A   B   C
0 1.0 2.0 3.0
1 2.0 4.0 6.0
2 3.0 6.0 9.0
3 4.0 NaN NaN
4 NaN 8.0 NaN
5 5.0 10.0 30.0

axis=0 means going along the indices and taking the last valid entry, so the expected result is:

df.interpolate(method='ffill', axis=0)
     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   4.0   4.0
4  NaN   8.0   8.0
5  5.0  10.0  30.0

axis=1 means going along the columns:

df.interpolate(method='ffill', axis=1)
     A     B     C
0  1.0   2.0   3.0
1  2.0   4.0   6.0
2  3.0   6.0   9.0
3  4.0   6.0   9.0
4  4.0   8.0   9.0
5  5.0  10.0  30.0
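
To see which axis produces which table on your own installation, here is a minimal sketch that uses `DataFrame.ffill` instead of `interpolate(method='ffill')`. That substitution is an assumption on my part, not something from this thread: as far as I know, recent pandas releases point users to `ffill` for forward filling. The names `down` and `across` are illustrative. With `ffill`, `axis=0` carries values down each column and `axis=1` carries them across each row, so the axis labels attached to the two tables quoted in this thread may come out the other way round depending on the pandas version.

```python
import numpy as np
import pandas as pd

data = np.array([
    [1, 2, 3, 4, np.nan, 5],
    [2, 4, 6, np.nan, 8, 10],
    [3, 6, 9, np.nan, np.nan, 30],
]).T
df = pd.DataFrame(data, columns=["A", "B", "C"])

# axis=0: each NaN takes the last valid value above it in the same column.
down = df.ffill(axis=0)

# axis=1: each NaN takes the last valid value to its left in the same row.
across = df.ffill(axis=1)

print(down)
print(across)
```

Comparing both printed frames against the tables in the thread is a quick way to tell which convention a given pandas version follows.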

But consider, for example, only rows 3 to 5:

df
     A     B     C
3  4.0   NaN   NaN
4  NaN   8.0   NaN
5  5.0  10.0  30.0

df.interpolate(method='ffill', axis=1)
     A     B     C
3  4.0   NaN   NaN
4  4.0   8.0   NaN
5  5.0  10.0  30.0
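
This last example appears to illustrate that a forward fill can only copy a value that already exists earlier along the fill direction, so NaNs with nothing valid before them are left as they are. A small sketch of that behaviour, again assuming `DataFrame.ffill` rather than the `interpolate` call quoted above, and filling down the columns (`axis=0` in `ffill`'s own convention, which may not match the `axis=1` label above on every pandas version). The names `sub`, `filled` and `remaining` are illustrative.

```python
import numpy as np
import pandas as pd

# The last three rows of the frame from the question.
sub = pd.DataFrame(
    {
        "A": [4.0, np.nan, 5.0],
        "B": [np.nan, 8.0, 10.0],
        "C": [np.nan, np.nan, 30.0],
    },
    index=[3, 4, 5],
)

# Fill down each column: only the NaN in A has a valid value above it.
filled = sub.ffill(axis=0)

# NaNs at the top of B and C have no earlier value to copy, so they remain.
remaining = int(filled.isna().sum().sum())

print(filled)
print(remaining)
```

If the leading gaps also need a value, a backward fill or an explicit default has to be applied as a separate step.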

0 reactions
saarahrasheed commented, May 7, 2020

Hi. Thanks for the prompt responses and explanation. I believe you’re right. Sorry, the confusion was on my end! Thanks again.
