Dict/dict keys in DataFrame.__getitem__


Code Sample, a copy-pastable example if possible

In [2]: pd.Series(index=range(10))[{1:2,3:4}]
Out[2]: 
1   NaN
3   NaN
dtype: float64

In [3]: pd.Series(index=range(10))[{1:2,3:4}.keys()]
Out[3]: 
1   NaN
3   NaN
dtype: float64

In [4]: pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-5b29d5d42cf2> in <module>()
----> 1 pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}]

~/nobackup/repo/pandas/pandas/core/frame.py in __getitem__(self, key)
   2683             return self._getitem_multilevel(key)
   2684         else:
-> 2685             return self._getitem_column(key)
   2686 
   2687     def _getitem_column(self, key):

~/nobackup/repo/pandas/pandas/core/frame.py in _getitem_column(self, key)
   2690         # get column
   2691         if self.columns.is_unique:
-> 2692             return self._get_item_cache(key)
   2693 
   2694         # duplicate columns & possible reduce dimensionality

~/nobackup/repo/pandas/pandas/core/generic.py in _get_item_cache(self, item)
   2482         """Return the cached item, item represents a label indexer."""
   2483         cache = self._item_cache
-> 2484         res = cache.get(item)
   2485         if res is None:
   2486             values = self._data.get(item)

TypeError: unhashable type: 'dict'

In [5]: pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}.keys()]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-030f516d9637> in <module>()
----> 1 pd.DataFrame(index=range(10), columns=range(10))[{1:2,3:4}.keys()]

~/nobackup/repo/pandas/pandas/core/frame.py in __getitem__(self, key)
   2683             return self._getitem_multilevel(key)
   2684         else:
-> 2685             return self._getitem_column(key)
   2686 
   2687     def _getitem_column(self, key):

~/nobackup/repo/pandas/pandas/core/frame.py in _getitem_column(self, key)
   2690         # get column
   2691         if self.columns.is_unique:
-> 2692             return self._get_item_cache(key)
   2693 
   2694         # duplicate columns & possible reduce dimensionality

~/nobackup/repo/pandas/pandas/core/generic.py in _get_item_cache(self, item)
   2482         """Return the cached item, item represents a label indexer."""
   2483         cache = self._item_cache
-> 2484         res = cache.get(item)
   2485         if res is None:
   2486             values = self._data.get(item)

TypeError: unhashable type: 'dict_keys'

Problem description

I know that DataFrame.__getitem__ is a mess (#9595), but I don’t see why dicts and dict keys shouldn’t simply be treated as list-likes, as they already are with Series.
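To make the two properties in play concrete, here is a small sketch: the tracebacks above fail because the key is unhashable, while the request is for the key to be handled as a list-like. It uses the public helper `pandas.api.types.is_list_like`. The `describe` function is a hypothetical name introduced only for this illustration, and nothing here claims to be the check that `DataFrame.__getitem__` performs internally.

```python
from pandas.api.types import is_list_like


def describe(key):
    """Return (is_list_like, is_hashable) for a candidate indexing key.

    Hypothetical helper for illustration only; not part of pandas.
    """
    try:
        hash(key)
        hashable = True
    except TypeError:
        # dicts and dict_keys views land here, matching the
        # "unhashable type" errors in the tracebacks above
        hashable = False
    return is_list_like(key), hashable


d = {1: 2, 3: 4}

for key in (d, d.keys(), [1, 3], "abc"):
    list_like, hashable = describe(key)
    print(type(key).__name__, "list-like:", list_like, "hashable:", hashable)
```

A dict and its keys view are both list-like and unhashable, just like a plain list, so a code path that tries a hashable-label lookup first will fail on them in the way the tracebacks show.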

Expected Output

In [6]: pd.DataFrame(index=range(10), columns=range(10))[list({1:2,3:4})]
Out[6]: 
     1    3
0  NaN  NaN
1  NaN  NaN
2  NaN  NaN
3  NaN  NaN
4  NaN  NaN
5  NaN  NaN
6  NaN  NaN
7  NaN  NaN
8  NaN  NaN
9  NaN  NaN
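
The expected output above already relies on wrapping the dict in `list(...)`. As a hedged sketch of that workaround, the snippet below builds a frame like the one in the report (filled with NaN explicitly, which is an assumption made here so the dtype does not depend on the pandas version) and selects columns from both a dict and its keys view after converting them to lists.

```python
import numpy as np
import pandas as pd

# Frame mirroring the report: 10 rows, integer column labels 0..9, all NaN.
df = pd.DataFrame(np.nan, index=range(10), columns=range(10))

wanted = {1: 2, 3: 4}  # only the keys are used for column selection

# Converting to a list makes the key unambiguously list-like,
# regardless of how the installed pandas treats dicts or dict_keys.
from_dict = df[list(wanted)]
from_keys = df[list(wanted.keys())]

print(from_dict.columns.tolist())
print(from_dict.shape)
```

Iterating a dict yields its keys, so `list(wanted)` and `list(wanted.keys())` select the same columns.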

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.24.0.dev0+25.gcd0447102
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.25.2
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1153+gff6786446
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

1 reaction
jorisvandenbossche commented, Jun 29, 2018

Given the discussion above, I am fine with going forward and handling dict here not as a special case but just as any other iterable.

0 reactions
toobaz commented, Jun 20, 2018

@jorisvandenbossche if you think we need more time to decide, I will change #21313 so that it temporarily keeps the current behavior.
