Calling df.loc with multiple arguments results in KeyError
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Version 11.6.4
- Modin version (modin.__version__): 0.14.0
- Python version: Python 3.8.11
- Code we can use to reproduce:
import modin.pandas as pd
import numpy as np
arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
df.loc['bar', 'one']
Resulting Error:
KeyError Traceback (most recent call last)
<ipython-input-7-5557f8ed36a3> in <module>
----> 1 df.loc['bar', 'one']
~/Desktop/modin/modin/pandas/indexing.py in __getitem__(self, key)
636 return self._handle_boolean_masking(row_loc, col_loc)
637
--> 638 row_lookup, col_lookup = self._compute_lookup(row_loc, col_loc)
639 result = super(_LocIndexer, self).__getitem__(row_lookup, col_lookup, ndim)
640 if isinstance(result, Series):
~/Desktop/modin/modin/pandas/indexing.py in _compute_lookup(self, row_loc, col_loc)
843 else axis_loc
844 )
--> 845 raise KeyError(missing_labels)
846
847 if isinstance(axis_lookup, pandas.Index) and not is_range_like(axis_lookup):
KeyError: array(['one'], dtype='<U3')
Expected Output (with pandas):
0 0.395674
1 -0.426304
2 0.273483
3 -0.702982
Name: (bar, one), dtype: float64
Describe the problem
Calling df.loc with multiple arguments on a DataFrame with a MultiIndex causes Modin to treat the second label as missing and raise a KeyError, whereas pandas returns the matching row.
Source code / logs
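The KeyError names only 'one', which suggests (this is an inference from the traceback, not something confirmed in the thread) that the second argument was looked up on its own rather than as part of the row key ('bar', 'one'). A minimal sketch, using plain pandas rather than Modin so it runs without Modin installed, shows that 'one' exists only as a second-level row label and not as a column label:

```python
import numpy as np
import pandas as pd  # plain pandas here; the report uses `import modin.pandas as pd`

arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
# Zeros instead of random data: only the labels matter for this check.
df = pd.DataFrame(np.zeros((8, 4)), index=arrays)

# The pair is a valid key into the two-level row index...
is_row_key = ("bar", "one") in df.index
# ...but 'one' by itself is not one of the column labels 0..3.
is_column = "one" in df.columns

print(is_row_key, is_column)
```

Under these assumptions, any lookup of 'one' as a standalone column label would fail, which is consistent with the reported error message.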
@alvin-chang An easy workaround for this issue would be to separate out the calls to .loc. For instance, in the case listed above you could do df.loc['bar'].loc['one']. This should unblock you while we work towards putting in a fix.
@anmyachev, if Modin behavior does not match the pandas behavior, we issue a warning like this: https://github.com/modin-project/modin/blob/f41432c1c746c6a6186c376594c6c7f7dd24cdb5/modin/core/storage_formats/pandas/query_compiler.py#L2038
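The chained .loc workaround suggested above can be sketched as follows. The example uses plain pandas so that it runs without Modin installed. Swapping the import for `import modin.pandas as pd` is the case the workaround targets, and its behavior on Modin 0.14.0 is as described by the maintainer rather than verified here. The seeded generator is an assumption added for reproducibility and is not part of the original report.

```python
import numpy as np
import pandas as pd  # plain pandas; the report uses `import modin.pandas as pd`

arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
df = pd.DataFrame(rng.standard_normal((8, 4)), index=arrays)

# Workaround: select the outer level first, then the inner level.
chained = df.loc["bar"].loc["one"]

# Original call from the report; works in pandas, raised KeyError in Modin 0.14.0.
direct = df.loc["bar", "one"]

# Both select the same row, so the values agree.
print(np.array_equal(chained.to_numpy(), direct.to_numpy()))
```

One difference to keep in mind: the chained result is named after the inner label only ('one'), while the direct lookup in pandas is named with the full tuple ('bar', 'one'), so code that relies on the Series name may need to set it explicitly.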