DataFrame.loc[n] = dict(..) fails with some type combinations
Code Sample, a copy-pastable example if possible
This one fails:
In [9]: d = pd.DataFrame(columns=['time', 'value'])
In [9]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-9-b557eb950858> in <module>()
----> 1 d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
177 key = com._apply_if_callable(key, self.obj)
178 indexer = self._get_setitem_indexer(key)
--> 179 self._setitem_with_indexer(indexer, value)
180
181 def _has_valid_type(self, k, axis):
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
423 name=indexer)
424
--> 425 self.obj._data = self.obj.append(value)._data
426 self.obj._maybe_update_cacher(clear=True)
427 return self.obj
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in append(self, other, ignore_index, verify_integrity)
4628 other = DataFrame(other.values.reshape((1, len(other))),
4629 index=index,
-> 4630 columns=combined_columns)
4631 other = other._convert(datetime=True, timedelta=True)
4632 if not self.columns.equals(combined_columns):
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
304 else:
305 mgr = self._init_ndarray(data, index, columns, dtype=dtype,
--> 306 copy=copy)
307 elif isinstance(data, (list, types.GeneratorType)):
308 if isinstance(data, types.GeneratorType):
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in _init_ndarray(self, values, index, columns, dtype, copy)
481 values = maybe_infer_to_datetimelike(values)
482
--> 483 return create_block_manager_from_blocks([values], [columns, index])
484
485 @property
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in create_block_manager_from_blocks(blocks, axes)
4294 placement=slice(0, len(axes[0])))]
4295
-> 4296 mgr = BlockManager(blocks, axes)
4297 mgr._consolidate_inplace()
4298 return mgr
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
2790 raise AssertionError('Number of Block dimensions (%d) '
2791 'must equal number of axes (%d)' %
-> 2792 (block.ndim, self.ndim))
2793
2794 if do_integrity_check:
AssertionError: Number of Block dimensions (1) must equal number of axes (2)
But this one succeeds:
In [11]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value=5)
In [12]: d
Out[12]:
time value
0 00:00:05 5
This one also succeeds:
In [13]: d = pd.DataFrame(columns=['time', 'value'])
In [14]: d.loc[0] = dict(time=3, value='foo')
In [15]: d
Out[15]:
time value
0 3 foo
Problem description
The current behavior is a problem because it is inconsistent and depends on the types of the data provided. Mixing timedelta with str fails, but timedelta with int works, as does int with str.
I believe this is related to aggressive type inference previously noted in #13829.
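The three combinations above can be compared side by side with a small script. This is only a sketch: `try_assign`, `cases`, and `results` are names made up for illustration, and each case starts from a fresh empty frame (the session above reuses `d` for the second case). On pandas 0.20.1 the first case is the one expected to raise, per the traceback; other versions may behave differently.

```python
import pandas as pd


def try_assign(row):
    """Assign `row` to label 0 of a fresh empty frame and report the outcome."""
    d = pd.DataFrame(columns=['time', 'value'])
    try:
        d.loc[0] = row
    except Exception as exc:  # AssertionError in the traceback reported above
        return False, type(exc).__name__
    return True, None


# The three type combinations described in the report.
cases = {
    'timedelta + str': dict(time=pd.to_timedelta(5, unit='s'), value='foo'),
    'timedelta + int': dict(time=pd.to_timedelta(5, unit='s'), value=5),
    'int + str': dict(time=3, value='foo'),
}

results = {name: try_assign(row) for name, row in cases.items()}
for name, (ok, err) in results.items():
    print(name, 'ok' if ok else 'raised ' + err)
```

Catching the exception per case keeps one failing combination from hiding the result of the others.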
Expected Output
Not crashing.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-77-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.0
feather: None
matplotlib: 2.0.1
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: 0.1.0
pandas_gbq: None
pandas_datareader: None
Issue Analytics
- Created 6 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
Ah, yes, I see that as well (so when the label already exists). This already is raising in 0.19.2, so not a new bug …
Works now, setting once and setting twice
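For anyone who wants to re-check this on their own install, here is a rough sketch of the "setting once and setting twice" check, using the failing timedelta + str row from the report. The `outcomes` list and the loop are illustrative only, and the result depends on the pandas version in use.

```python
import pandas as pd

d = pd.DataFrame(columns=['time', 'value'])
row = dict(time=pd.to_timedelta(5, unit='s'), value='foo')

outcomes = []
for attempt in ('first', 'second'):
    try:
        # The first pass creates label 0; the second targets the existing label.
        d.loc[0] = row
        outcomes.append((attempt, 'ok'))
    except Exception as exc:
        outcomes.append((attempt, type(exc).__name__))

print(outcomes)
print(d)
```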