Dict/dict keys in DataFrame.__getitem__
Code Sample, a copy-pastable example if possible
In [2]: pd.Series(index=range(10))[{1:2,3:4}]
Out[2]:
1 NaN
3 NaN
dtype: float64
In [3]: pd.Series(index=range(10))[{1:2,3:4}.keys()]
Out[3]:
1 NaN
3 NaN
dtype: float64
In [4]: pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-5b29d5d42cf2> in <module>()
----> 1 pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}]
~/nobackup/repo/pandas/pandas/core/frame.py in __getitem__(self, key)
2683 return self._getitem_multilevel(key)
2684 else:
-> 2685 return self._getitem_column(key)
2686
2687 def _getitem_column(self, key):
~/nobackup/repo/pandas/pandas/core/frame.py in _getitem_column(self, key)
2690 # get column
2691 if self.columns.is_unique:
-> 2692 return self._get_item_cache(key)
2693
2694 # duplicate columns & possible reduce dimensionality
~/nobackup/repo/pandas/pandas/core/generic.py in _get_item_cache(self, item)
2482 """Return the cached item, item represents a label indexer."""
2483 cache = self._item_cache
-> 2484 res = cache.get(item)
2485 if res is None:
2486 values = self._data.get(item)
TypeError: unhashable type: 'dict'
In [5]: pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}.keys()]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-030f516d9637> in <module>()
----> 1 pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}.keys()]
~/nobackup/repo/pandas/pandas/core/frame.py in __getitem__(self, key)
2683 return self._getitem_multilevel(key)
2684 else:
-> 2685 return self._getitem_column(key)
2686
2687 def _getitem_column(self, key):
~/nobackup/repo/pandas/pandas/core/frame.py in _getitem_column(self, key)
2690 # get column
2691 if self.columns.is_unique:
-> 2692 return self._get_item_cache(key)
2693
2694 # duplicate columns & possible reduce dimensionality
~/nobackup/repo/pandas/pandas/core/generic.py in _get_item_cache(self, item)
2482 """Return the cached item, item represents a label indexer."""
2483 cache = self._item_cache
-> 2484 res = cache.get(item)
2485 if res is None:
2486 values = self._data.get(item)
TypeError: unhashable type: 'dict_keys'
Problem description
I know that DataFrame.__getitem__ is a mess (#9595), but I don't see why dicts and dict keys shouldn't simply be treated as list-likes, as already happens with Series.
Expected Output
In [6]: pd.DataFrame(index=range(10), columns=range(10))[list({1:2,3:4})]
Out[6]:
     1    3
0  NaN  NaN
1  NaN  NaN
2  NaN  NaN
3  NaN  NaN
4  NaN  NaN
5  NaN  NaN
6  NaN  NaN
7  NaN  NaN
8  NaN  NaN
9  NaN  NaN
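In the meantime, the expected output above can be obtained by converting the key to a list before indexing. Here is a minimal sketch of that workaround; `select_columns` is a hypothetical helper written for this report, not part of pandas.

```python
from collections.abc import KeysView, Mapping

import pandas as pd


def select_columns(frame, key):
    """Hypothetical helper: index `frame` with `key`, turning dicts and
    dict key views into plain lists of labels first."""
    if isinstance(key, (Mapping, KeysView)):
        # Iterating a dict (or its keys view) yields the keys.
        key = list(key)
    return frame[key]


df = pd.DataFrame(index=range(10), columns=range(10))
by_dict = select_columns(df, {1: 2, 3: 4})
by_keys = select_columns(df, {1: 2, 3: 4}.keys())
print(by_dict.shape, by_keys.shape)
```

Both calls select columns 1 and 3, the same as `df[list({1: 2, 3: 4})]`.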
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.24.0.dev0+25.gcd0447102
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.25.2
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1153+gff6786446
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1
Issue Analytics
- Created 5 years ago
- Comments:11 (11 by maintainers)
Given the discussion above, I am fine with going forward and handling dict here not as a special case but just as any other iterable.
@jorisvandenbossche if you think we need more time to decide, I will change #21313 so that it temporarily keeps the current behavior
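To illustrate what handling a dict "just as any other iterable" could mean, here is a rough sketch in plain Python. It is not the code from #21313 or anything in pandas: `classify_key` is a hypothetical name, and the sketch ignores slices, boolean masks and MultiIndex keys. The assumed rule is that a hashable key is a single label, while any other iterable is read as a list of labels.

```python
def classify_key(key):
    """Hypothetical sketch, not pandas code: decide how an indexing key is read."""
    try:
        hash(key)
    except TypeError:
        # Unhashable keys (dict, dict_keys, list, ...) are handled below.
        pass
    else:
        # Hashable keys (int, str, tuple of hashables) are a single label.
        return "label", key
    try:
        # Any other iterable is a collection of labels; a dict yields its keys.
        return "labels", list(key)
    except TypeError:
        raise TypeError("cannot index with " + type(key).__name__) from None


print(classify_key({1: 2, 3: 4}))
print(classify_key({1: 2, 3: 4}.keys()))
print(classify_key("a"))
```

Under this rule a dict needs no branch of its own: it fails the hash check like a list does and is then iterated, which yields its keys.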