defaulting to pandas on a reindex causes a raise
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): python 3.8.3-slim (docker)
- Modin version (modin.__version__): 0.7.4
- Python version: 3.8.3
- Code we can use to reproduce: see "Source code / logs" below
Describe the problem
I have a specific case where I’m building a timeseries dataframe and then backfilling some data.
To do this, I turn two columns into a MultiIndex, then I create a new index from all of the values I’d like to see backfilled.
Then I set_index and reindex on the new “fuller” index.
I looked through the Modin code and it looks like it doesn’t support reindexing on a MultiIndex, which is totally understandable. But then it appears to default to the original pandas incorrectly. I think it’s because some defaults are being set at the top of the reindex method (as part of the check), and these defaults are then passed to baseline pandas, which would normally get None for them (e.g. the axis or index parameters).
I tried to simulate my case with some dummy code. Not sure if it makes it clearer or more confusing 😃
Source code / logs
>>> df = pandas.DataFrame({"foo": [1,2,3,4], "bar": ["a", "b", "c", "d"], "waldo": [11, 12, 13, 14]})
UserWarning: Distributing <class 'dict'> object. This may take some time.
>>> df = df.set_index(["foo", "bar"])
>>> df
         waldo
foo bar
1   a       11
2   b       12
3   c       13
4   d       14
>>> new_index = pandas.MultiIndex.from_product([["a", "b", "c"], ["d", "e", "f"]])
>>> df.reindex(new_index)
UserWarning: `DataFrame.reindex` defaulting to pandas implementation.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/orenmazor/.pyenv/versions/3.8.2/Python.framework/Versions/3.8/lib/python3.8/site-packages/modin/pandas/base.py", line 2038, in reindex
return self._default_to_pandas(
File "/Users/orenmazor/.pyenv/versions/3.8.2/Python.framework/Versions/3.8/lib/python3.8/site-packages/modin/pandas/base.py", line 251, in _default_to_pandas
result = getattr(getattr(pandas, self.__name__), op)(
File "/Users/orenmazor/.pyenv/versions/3.8.2/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/util/_decorators.py", line 227, in wrapper
return func(*args, **kwargs)
File "/Users/orenmazor/.pyenv/versions/3.8.2/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/frame.py", line 3851, in reindex
axes = validate_axis_style_args(self, args, kwargs, "labels", "reindex")
File "/Users/orenmazor/.pyenv/versions/3.8.2/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/util/_validators.py", line 260, in validate_axis_style_args
raise TypeError(msg)
TypeError: Cannot specify both 'axis' and any of 'index' or 'columns'.
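For what it's worth, the same TypeError can be triggered in plain pandas, which seems consistent with the suspicion above. This is only a minimal sketch and does not use Modin at all: it assumes (unconfirmed) that the fallback forwards both an explicit axis and an index to pandas. The names full_index, filled, and error_message are made up for illustration, and full_index uses values that match the existing index levels so the backfill result is easy to check.

```python
import pandas as pd

# Same shape as the dummy frame in the report, built with plain pandas.
df = pd.DataFrame(
    {"foo": [1, 2, 3, 4], "bar": ["a", "b", "c", "d"], "waldo": [11, 12, 13, 14]}
).set_index(["foo", "bar"])

# A "fuller" index: every combination of the two level values.
full_index = pd.MultiIndex.from_product(
    [[1, 2, 3, 4], ["a", "b", "c", "d"]], names=["foo", "bar"]
)

# Passing only the new labels works: missing combinations become NaN.
filled = df.reindex(full_index)

# Passing both `index` and an explicit `axis` (assumed to be what the
# fallback ends up doing) is rejected by pandas itself.
try:
    df.reindex(index=full_index, axis=0)
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(len(filled), error_message)
```

If that assumption holds, the fix would be for the fallback to pass through only the arguments the caller actually supplied.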
@drPytho thanks for the bump, here is the current workaround:
Hope this helps (in the short term). I will make sure this gets fixed for the next release.
This issue was fixed by #2660