BUG: "IndexingError: Too many indexers" when accessing a None value using .loc through a MultiIndex

Steps to reproduce

import pandas as pd

s = pd.Series(
    [None],
    index=pd.MultiIndex.from_arrays([['Level1'], ['Level2']]))
print(s.loc[('Level1', 'Level2')])

Expected output

None

Actual output

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __getitem__(self, key)
   1760                 except (KeyError, IndexError, AttributeError):
   1761                     pass
-> 1762             return self._getitem_tuple(key)
   1763         else:
   1764             # we by definition only have the 0th axis

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
   1275 
   1276         # no multi-index, so validate all of the indexers
-> 1277         self._has_valid_tuple(tup)
   1278 
   1279         # ugly hack for GH #836

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _has_valid_tuple(self, key)
    699         for i, k in enumerate(key):
    700             if i >= self.ndim:
--> 701                 raise IndexingError("Too many indexers")
    702             try:
    703                 self._validate_key(k, i)

IndexingError: Too many indexers

Additional information

If any value other than None (even np.nan) is used, the code behaves correctly.

If a single index level is used, the code behaves correctly.

Workaround

Using .loc(axis=0)[('Level1', 'Level2')] instead appears to work.
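The workaround can be sketched as follows. This is a minimal example that assumes pandas is installed and reuses the Series from the steps to reproduce; the variable names `s` and `value` are illustrative only.

```python
import pandas as pd

# Same one-element Series as in the steps to reproduce.
s = pd.Series(
    [None],
    index=pd.MultiIndex.from_arrays([["Level1"], ["Level2"]]),
)

# With the axis given explicitly, the tuple is looked up as one key on
# the MultiIndex instead of being read as one indexer per axis.
value = s.loc(axis=0)[("Level1", "Level2")]
print(value)
```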

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : None
python : 3.8.3.candidate.1
python-bits : 64
OS : Linux
OS-release : 5.6.0-1-amd64
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.0.3
numpy : 1.18.2
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.3
Cython : None
pytest : 4.6.9
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 4.6.9
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None

Issue Analytics

State:
Created 3 years ago
Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction

jorisvandenbossche commented, May 28, 2020

The problem, I think, is that False in principle can also be a valid result (so similar problem as with None).

So I think we need another way: either by raising an error and catching it in the layer above, or with a custom object like no_result = object() and then if result != no_result: return result
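The sentinel idea from this comment can be sketched in plain Python. This is only an illustration of the pattern, not the pandas implementation: `lookup_fast_path`, `lookup`, and the dict-based `data` are hypothetical stand-ins, and the sketch compares with `is not` (identity) rather than `!=`.

```python
# Hypothetical sentinel; the name follows the comment above, not pandas source.
no_result = object()


def lookup_fast_path(data, key):
    """Return the stored value, or the sentinel if this path finds nothing."""
    return data.get(key, no_result)


def lookup(data, key):
    result = lookup_fast_path(data, key)
    if result is not no_result:
        # None and False are legitimate values and are returned unchanged.
        return result
    # Stand-in for falling through to a slower or stricter code path.
    raise KeyError(key)


data = {("Level1", "Level2"): None, ("Level1", "Other"): False}
print(lookup(data, ("Level1", "Level2")))
print(lookup(data, ("Level1", "Other")))
```

Because the sentinel is a fresh object that no caller can store as a value, "nothing found" cannot be confused with a stored None or False.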

1 reaction

jorisvandenbossche commented, May 26, 2020

@pedrooa Sure, that would be very welcome!