pd.cut() with user-specified labels and bins arguments does not use user-provided labels
Code Sample
import pandas as pd
bins = pd.IntervalIndex.from_tuples([(0, 2), (2, 2.5), (2.5, 5)])
pd.cut([1, 1.5, 2, 2.5, 3, 3.5], bins=bins, labels=['a', 'b', 'c']).tolist()
# output of pd.cut(...) is:
# [Interval(0.0, 2.0, closed='right'), Interval(0.0, 2.0, closed='right'), Interval(0.0, 2.0, closed='right'), Interval(2.0, 2.5, closed='right'), Interval(2.5, 5.0, closed='right'), Interval(2.5, 5.0, closed='right')]
Problem description
When labels are passed to pd.cut() together with bins given as an IntervalIndex, the labels argument is ignored and the output contains the Interval objects instead.
Expected Output
['a', 'a', 'a', 'b', 'c', 'c']
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None python: 2.7.13.final.0 python-bits: 64 OS: Darwin OS-release: 17.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None
pandas: 0.23.0 pytest: 3.0.5 pip: 10.0.1 setuptools: 39.0.1 Cython: 0.25.2 numpy: 1.14.2 scipy: 1.0.1 pyarrow: None xarray: None IPython: 5.3.0 sphinx: None patsy: 0.5.0 dateutil: 2.7.2 pytz: 2018.3 blosc: None bottleneck: None tables: None numexpr: None feather: None matplotlib: 2.2.2 openpyxl: None xlrd: None xlwt: None xlsxwriter: None lxml: None bs4: 4.6.0 html5lib: 0.9999999 sqlalchemy: None pymysql: 0.7.11.None psycopg2: None jinja2: 2.9.5 s3fs: None fastparquet: None pandas_gbq: None pandas_datareader: None
Issue Analytics
- Created 5 years ago
- Reactions: 13
- Comments:5 (3 by maintainers)
Just in case someone is looking for a workaround: you can create a helper dictionary that maps each interval to its label and look up the values. This works for the list in the code sample above. For a pandas Series, you can use series.map.
Did you see the docstring for pd.cut? I'm not sure exactly why, but labels is ignored for IntervalIndex (II) bins.
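The code for the helper-dictionary workaround did not survive in the comment above, so here is a minimal sketch of what it might look like. The names interval_to_label, labelled_list and labelled_series are illustrative and do not come from the original comment. Note that the intervals are closed on the right, so the value 2 falls in (0, 2] and gets the label 'a'.

```python
import pandas as pd

bins = pd.IntervalIndex.from_tuples([(0, 2), (2, 2.5), (2.5, 5)])
labels = ['a', 'b', 'c']
values = [1, 1.5, 2, 2.5, 3, 3.5]

# Helper dictionary (hypothetical name): one label per interval, in bin order.
interval_to_label = dict(zip(bins, labels))

# For a plain list: cut without labels, then look each Interval up in the dict.
# .get() returns None for values that fall outside every bin (NaN).
binned = pd.cut(values, bins=bins)
labelled_list = [interval_to_label.get(interval) for interval in binned]

# For a pandas Series: the same lookup via Series.map.
series = pd.Series(values)
labelled_series = pd.cut(series, bins=bins).map(interval_to_label)

print(labelled_list)
print(labelled_series.tolist())
```

The dictionary lookup relies on Interval objects being hashable and on the labels being listed in the same order as the bins.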