DataFrame.loc[n] = dict(..) fails with some type combinations

Code Sample, a copy-pastable example if possible

This one fails:

In [8]: d = pd.DataFrame(columns=['time', 'value'])
In [9]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-9-b557eb950858> in <module>()
----> 1 d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    177             key = com._apply_if_callable(key, self.obj)
    178         indexer = self._get_setitem_indexer(key)
--> 179         self._setitem_with_indexer(indexer, value)
    180 
    181     def _has_valid_type(self, k, axis):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    423                                        name=indexer)
    424 
--> 425                     self.obj._data = self.obj.append(value)._data
    426                     self.obj._maybe_update_cacher(clear=True)
    427                     return self.obj

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in append(self, other, ignore_index, verify_integrity)
   4628             other = DataFrame(other.values.reshape((1, len(other))),
   4629                               index=index,
-> 4630                               columns=combined_columns)
   4631             other = other._convert(datetime=True, timedelta=True)
   4632             if not self.columns.equals(combined_columns):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    304             else:
    305                 mgr = self._init_ndarray(data, index, columns, dtype=dtype,
--> 306                                          copy=copy)
    307         elif isinstance(data, (list, types.GeneratorType)):
    308             if isinstance(data, types.GeneratorType):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in _init_ndarray(self, values, index, columns, dtype, copy)
    481             values = maybe_infer_to_datetimelike(values)
    482 
--> 483         return create_block_manager_from_blocks([values], [columns, index])
    484 
    485     @property

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in create_block_manager_from_blocks(blocks, axes)
   4294                                      placement=slice(0, len(axes[0])))]
   4295 
-> 4296         mgr = BlockManager(blocks, axes)
   4297         mgr._consolidate_inplace()
   4298         return mgr

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
   2790                     raise AssertionError('Number of Block dimensions (%d) '
   2791                                          'must equal number of axes (%d)' %
-> 2792                                          (block.ndim, self.ndim))
   2793 
   2794         if do_integrity_check:

AssertionError: Number of Block dimensions (1) must equal number of axes (2)

But this one succeeds:

In [11]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value=5)

In [12]: d
Out[12]: 
      time value
0 00:00:05     5

This one also succeeds:

In [13]: d = pd.DataFrame(columns=['time', 'value'])

In [14]: d.loc[0] = dict(time=3, value='foo')

In [15]: d
Out[15]: 
  time value
0    3   foo

Problem description

The current behavior is a problem because it is inconsistent: whether the assignment succeeds depends on the types of the values in the dict. Mixing a timedelta with a str fails, but a timedelta with an int works, as does an int with a str.

I believe this is related to aggressive type inference previously noted in #13829.
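To check the three combinations side by side, a minimal sketch like the following may help. The helper name `try_row_assignment` is made up for illustration and is not part of pandas. It uses a fresh empty frame for each case and records the exception type instead of stopping at the first failure. On pandas 0.20.1 the first case is expected to report the AssertionError shown in the traceback above; other versions may behave differently.

```python
import pandas as pd


def try_row_assignment(row):
    """Hypothetical helper: assign `row` to .loc[0] of a fresh, empty frame.

    Returns 'ok' on success, otherwise the name of the exception raised.
    """
    d = pd.DataFrame(columns=['time', 'value'])
    try:
        d.loc[0] = row
    except Exception as exc:  # broad on purpose: we only record what failed
        return type(exc).__name__
    return 'ok'


# The three type combinations from the report.
cases = {
    'timedelta + str': dict(time=pd.to_timedelta(5, unit='s'), value='foo'),
    'timedelta + int': dict(time=pd.to_timedelta(5, unit='s'), value=5),
    'int + str': dict(time=3, value='foo'),
}

results = {name: try_row_assignment(row) for name, row in cases.items()}

for name, outcome in results.items():
    print('{:<16} -> {}'.format(name, outcome))
```

Running this under different pandas versions should show whether the inconsistency is still present.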

Expected Output

Not crashing.
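As a possible stopgap (an assumption on my part, not something confirmed as a fix for this issue), the row-by-row `.loc` enlargement can be avoided entirely by collecting the rows as dicts and building the frame in one call. The second row below is invented purely for illustration.

```python
import pandas as pd

# Collect rows as plain dicts first instead of enlarging the frame via .loc.
rows = [
    dict(time=pd.to_timedelta(5, unit='s'), value='foo'),  # row from the report
    dict(time=pd.to_timedelta(7, unit='s'), value='bar'),  # illustrative extra row
]

# Build the frame once; `columns` fixes the column order.
d = pd.DataFrame(rows, columns=['time', 'value'])
print(d)
```

This does not address the underlying inconsistency in `.loc` assignment; it only sidesteps the code path that raises.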

Output of `pd.show_versions()`

In [16]: pd.show_versions()
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/xarray/core/formatting.py:16: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version.
  from pandas.tslib import OutOfBoundsDatetime

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-77-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.0
feather: None
matplotlib: 2.0.1
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: 0.1.0
pandas_gbq: None
pandas_datareader: None