pandas interpolate inconsistent results with axis and method ffill
Problem: I am trying to interpolate an m x n DataFrame that contains NaN values. The default axis is 0 (index), so interpolating along axis 0 should fill a missing value at index n in each column using the values at index n-1 and n+1 (or n+p, where n+p is the closest index with a valid value). This holds for the default linear method but not for the ffill method.
Code:
import numpy as np
import pandas as pd
data = np.array([[1, 2, 3, 4, np.nan, 5], [2, 4, 6, np.nan, 8, 10], [3, 6, 9, np.nan, np.nan, 30]]).T
d = pd.DataFrame(data, columns=['A', 'B', 'C'])
d
A B C
0 1.0 2.0 3.0
1 2.0 4.0 6.0
2 3.0 6.0 9.0
3 4.0 NaN NaN
4 NaN 8.0 NaN
5 5.0 10.0 30.0
d.interpolate(method='ffill', axis=1)
A B C
0 1.0 2.0 3.0
1 2.0 4.0 6.0
2 3.0 6.0 9.0
3 4.0 6.0 9.0
4 4.0 8.0 9.0
5 5.0 10.0 30.0
d.interpolate(method='ffill')
A B C
0 1.0 2.0 3.0
1 2.0 4.0 6.0
2 3.0 6.0 9.0
3 4.0 4.0 4.0
4 NaN 8.0 8.0
5 5.0 10.0 30.0
d.interpolate()
A B C
0 1.0 2.0 3.0
1 2.0 4.0 6.0
2 3.0 6.0 9.0
3 4.0 7.0 16.0
4 4.5 8.0 23.0
5 5.0 10.0 30.0
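To make the linear case concrete, here is a minimal sketch that rebuilds column C from the question and compares `interpolate()` with the same values computed by hand. The names `c`, `filled`, and `manual` are only illustrative, and the hand calculation assumes equally spaced rows, which is how the default linear method treats the index.

```python
import numpy as np
import pandas as pd

# Column C from the question: two consecutive NaNs between 9 and 30.
c = pd.Series([3, 6, 9, np.nan, np.nan, 30], name='C')

# Default method is 'linear': rows are treated as equally spaced.
filled = c.interpolate()

# Same values by hand: the gap from row 2 (9) to row 5 (30) spans 3 steps,
# so each step adds (30 - 9) / 3 = 7.
manual = [9 + (30 - 9) * k / 3 for k in (1, 2)]

print(filled.tolist())
print(manual)
```

The two filled entries are 16.0 and 23.0, which agrees with rows 3 and 4 of column C in the `d.interpolate()` output above.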
Issue Analytics
- Created 3 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
For explanation: ffill takes the last valid entry along the given axis and fills NaNs.
Given axis=0 means going along the indices and taking the last valid entry, and therefore the expected result is:
Given axis=1 means going along the columns:
But for example
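The axis semantics described in this comment can be sketched with `DataFrame.ffill`, which takes the same `axis` argument. This is only an illustration on the frame from the question, not the output the commenter originally posted. The names `down` and `across` are made up for the example.

```python
import numpy as np
import pandas as pd

data = np.array([[1, 2, 3, 4, np.nan, 5],
                 [2, 4, 6, np.nan, 8, 10],
                 [3, 6, 9, np.nan, np.nan, 30]]).T
d = pd.DataFrame(data, columns=['A', 'B', 'C'])

# axis=0: walk down each column and carry the last valid value forward.
down = d.ffill(axis=0)

# axis=1: walk across each row (A -> B -> C) and carry the last valid value forward.
across = d.ffill(axis=1)

print(down)
print(across)
```

In this sketch `down` fills row 3 of B and C with 6.0 and 9.0 (the values from row 2), while `across` fills row 3 of B and C with 4.0 (the value from column A) and leaves row 4 of A as NaN because nothing precedes it in that row. Note that `down` matches the table the question shows for `axis=1`, and `across` matches the one shown for the default axis.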
Hi. Thanks for the prompt responses and explanation. I believe you’re right. Sorry the confusion was at my end! Thanks again.