Slicing Datetime MultiIndex with string or datetime.date slices

Slicing a MultiIndex that contains a DatetimeIndex level only seems to work when the slice bounds are datetime.datetime or pandas.Timestamp objects. One would expect it to also work with strings and with datetime.date bounds, as it does for a regular index.

This seems to be related to #3843.

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
import time
import datetime as dt

min_ts = time.mktime(dt.date(2016,10,1).timetuple())
timestamps = [min_ts + offset*86400 + noise*3600 
                  for offset, noise in enumerate(np.random.rand(100))]
time_idx = pd.to_datetime(sorted(timestamps), unit='s')
id_idx = np.random.choice(np.arange(10), 100)
df = pd.DataFrame(np.identity(100),
                 index=pd.MultiIndex.from_arrays(
                     [time_idx, id_idx]
                 ))

df.loc[dt.datetime(2016, 10, 1):]  # works
df.loc[dt.date(2016, 10, 1):]  # fails :/

Problem description

The code above raises the following exception on the last line, which is unexpected if you are used to indexing single-index DataFrames.

Traceback (most recent call last):
  File "<input>", line 11, in <module>
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/core/indexing.py", line 1312, in __getitem__
    return self._getitem_axis(key, axis=0)
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/core/indexing.py", line 1453, in _getitem_axis
    return self._get_slice_axis(key, axis=axis)
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/core/indexing.py", line 1334, in _get_slice_axis
    slice_obj.step, kind=self.name)
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/indexes/base.py", line 2997, in slice_indexer
    kind=kind)
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/indexes/multi.py", line 1578, in slice_locs
    return super(MultiIndex, self).slice_locs(start, end, step, kind=kind)
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/indexes/base.py", line 3176, in slice_locs
    start_slice = self.get_slice_bound(start, 'left', kind)
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/indexes/multi.py", line 1549, in get_slice_bound
    return self._partial_tup_index(label, side=side)
  File "/Users/kayibal/virtualenvs/traildb-sparse/lib/python3.5/site-packages/pandas/indexes/multi.py", line 1594, in _partial_tup_index
    raise TypeError('Level type mismatch: %s' % lab)
TypeError: Level type mismatch: 2016-10-01
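
A possible workaround, sketched here under the assumption that converting the bounds up front is acceptable: since slices with pandas.Timestamp bounds are reported to work, the datetime.date or string bounds can be turned into Timestamps before they reach .loc. The helper name as_timestamp_slice is hypothetical (it is not part of pandas), and the example uses a deterministic daily index rather than the noisy one above so the row counts are easy to check.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Deterministic stand-in for the frame in the report: 100 daily timestamps
# on the first level and an integer id on the second level.
time_idx = pd.date_range("2016-10-01", periods=100, freq="D")
id_idx = np.arange(100) % 10
df = pd.DataFrame(
    {"value": np.arange(100)},
    index=pd.MultiIndex.from_arrays([time_idx, id_idx], names=["time", "id"]),
)


def as_timestamp_slice(start=None, stop=None):
    """Hypothetical helper: build a slice whose bounds are pd.Timestamp.

    Accepts strings, datetime.date, datetime.datetime or None for each bound.
    """
    def convert(bound):
        return None if bound is None else pd.Timestamp(bound)

    return slice(convert(start), convert(stop))


# datetime.date lower bound, open upper bound
from_date = df.loc[as_timestamp_slice(start=dt.date(2016, 10, 11))]

# string bounds on both sides (label slices include both endpoints)
between = df.loc[as_timestamp_slice("2016-10-11", "2016-10-20")]
```

The conversion relies only on pd.Timestamp accepting dates and ISO-formatted strings; it does not change how pandas itself treats the MultiIndex.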

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.2.0
Cython: 0.25.2
numpy: 1.12.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

Issue Analytics

  • State: open
  • Created 6 years ago
  • Reactions: 2
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

2 reactions
rmongeca commented, Oct 14, 2020

Just to build on what @pabloarosado has commented: when the dates are given as datetime.datetime or datetime.date objects, the MultiIndex indexing also seems to work:

import datetime

df.loc[[(datetime.datetime(2020, 1, 1), 1), (datetime.datetime(2020, 2, 1), 2)]]  # This works
df.loc[[(datetime.date(2020, 1, 1), 1), (datetime.date(2020, 2, 1), 2)]]  # This also works

2 reactions
pabloarosado commented, Oct 14, 2020

I see this issue is still open. I have a similar problem on pandas 1.1.1:

import pandas as pd

df = pd.DataFrame({'a': ['2020-01-01', '2020-02-01', '2020-03-01'], 'b': [1, 2, 3], 'c': [10, 20, 30]})
df['a'] = pd.to_datetime(df['a'])
df = df.set_index(['a', 'b'])
df.loc[('2020-01-01', 1)]  # This works.
df.loc[[('2020-01-01', 1), ('2020-02-01', 2)]]  # This fails.
df.loc[[(pd.Timestamp(2020, 1, 1), 1), (pd.Timestamp(2020, 2, 1), 2)]]  # This works.

So, for a DataFrame with a MultiIndex that has a datetime level, you can pass the dates as strings only when selecting a single row. To select more than one row, the dates have to be given as Timestamps.

I know this is not a very common scenario, but is there any plan to solve it, or is there any workaround? Thank you!
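
One way to work around this, offered as a minimal sketch rather than an official fix: since the list of tuples is reported to work once the dates are Timestamps, string keys can be converted before they are passed to .loc. The variable names string_keys and timestamp_keys are illustrative only.

```python
import pandas as pd

# Same frame as in the comment above.
df = pd.DataFrame({'a': ['2020-01-01', '2020-02-01', '2020-03-01'],
                   'b': [1, 2, 3],
                   'c': [10, 20, 30]})
df['a'] = pd.to_datetime(df['a'])
df = df.set_index(['a', 'b'])

# Keys as they might arrive from user input or a config file.
string_keys = [('2020-01-01', 1), ('2020-02-01', 2)]

# Convert only the date part of each tuple; the second level stays as-is.
timestamp_keys = [(pd.Timestamp(date), b) for date, b in string_keys]

selected = df.loc[timestamp_keys]
```

The conversion is explicit, so it does not depend on whether a given pandas version parses strings inside a list of tuples.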


Top Results From Across the Web

  • Pandas Dataframe datetime slicing with Index vs MultiIndex
    Cross-section should work: df.xs(slice('2016-01-01', '2016-01-01'), level='date').
  • error slicing multi-Index with dates - Google Groups
    It is the 'parsing' of non-datetime objects (strings, dates) in slicing of multi-indexes that is not implemented (the issue Denis linked to).
  • slicing a pandas multiindex using datetime datatype
    You can slice on the Timestamps rather than the strings.
  • MultiIndex / advanced indexing — pandas 1.5.2 documentation
    You can slice a MultiIndex by providing multiple indexers. You can provide any of the selectors as if you are indexing by label,...
  • Pandas Extract Month and Year from Datetime
    You can extract month and year from the DateTime (date) column in pandas in several ways. In this article, I will explain how...
