pandas.DataFrame.interpolate() does not extrapolate even when asked to, depending on the interpolation method
Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
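# Series with one interior gap and four trailing NaNs
a = pd.Series([0, 1, np.nan, 3, 4, np.nan, np.nan, np.nan, np.nan])
# limit_area=None is expected to allow the trailing NaNs to be filled as well
a_int = a.interpolate(method='cubic', limit_area=None)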
Problem description
Some of the offered methods (apparently all of those provided by interp1d) are unable to extrapolate over np.nan. However, the limit_area switch for df.interpolate() suggests that extrapolation can be forced. Combining limit_area=None with an incompatible method should at least raise a warning.
There used to be a similar issue where extrapolation over trailing NaN was done unintentionally, so maybe the fix for that overdid it. https://github.com/pandas-dev/pandas/issues/8000
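Until such a warning exists, the check can be approximated on the user side. The following is only a minimal sketch: interpolate_checked is a hypothetical helper, not part of pandas, and limit_area="inside" is used in the demo call purely because it leaves trailing NaNs unfilled without requiring SciPy.

```python
import warnings

import numpy as np
import pandas as pd


def interpolate_checked(series, **kwargs):
    """Hypothetical helper: interpolate, then warn if trailing NaNs remain."""
    result = series.interpolate(**kwargs)
    mask = result.isna().to_numpy()
    # Count the NaNs located after the last valid value.
    n_trailing = len(mask) if mask.all() else int(np.argmax(~mask[::-1]))
    if n_trailing:
        warnings.warn(
            f"{n_trailing} trailing NaN value(s) were not filled",
            stacklevel=2,
        )
    return result


a = pd.Series([0, 1, np.nan, 3, 4, np.nan, np.nan, np.nan, np.nan])
# Assumption for the demo: limit_area="inside" restricts filling to interior
# gaps, so the four trailing NaNs stay and the helper emits its warning.
checked = interpolate_checked(a, method="linear", limit_area="inside")
print(checked.tolist())
```

The same wrapper could be called with method='cubic' and limit_area=None to flag the case described in this issue.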
Expected Output
Extrapolation over the NaNs in the array is expected. Using a different method, such as pchip, achieves this.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.7.2.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en
LOCALE : None.None
pandas : 0.25.3 (also tested with 1.0.0)
numpy : 1.15.4
pytz : 2018.9
dateutil : 2.7.5
pip : 20.0.2
setuptools : 41.0.1
Cython : 0.29.15
pytest : None
hypothesis : None
sphinx : 1.8.3
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : 7.5.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.3.3
matplotlib : 3.0.3
numexpr : None
odfpy : None
openpyxl : 2.5.12
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.2.1
sqlalchemy : None
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
Issue Analytics
- Created 4 years ago
- Reactions:9
- Comments:10 (2 by maintainers)
Top GitHub Comments
I second this.
Also, even when it works, it doesn’t. The implied meaning of “extrapolate” is that it will continue on the last available trend. However, the observed result is that the last value is repeated.
In:
Out:
@khaeru @lyndonchan To extrapolate in both directions, use limit_direction="both", which is not obvious at all. This gives:
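A minimal sketch of that suggestion, assuming the default linear method and a made-up series (not the data from the comments above). It also shows the behaviour described earlier in the thread: the edge values are repeated rather than continued along the trend.

```python
import numpy as np
import pandas as pd

# Illustrative series with one leading NaN, one interior gap, two trailing NaNs
s = pd.Series([np.nan, 1.0, 2.0, np.nan, 4.0, np.nan, np.nan])

# Default limit_direction is forward: the leading NaN is left untouched
forward_only = s.interpolate(method="linear")

# limit_direction="both" also fills the leading NaN
both = s.interpolate(method="linear", limit_direction="both")

print(forward_only.tolist())  # [nan, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0]
print(both.tolist())          # [1.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0]
```

With the linear method, the filled edge values are copies of the nearest valid value, so this is padding rather than true extrapolation.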