pd.cut() with user-specified labels and bins arguments does not use user-provided labels

Code Sample

import pandas as pd

bins = pd.IntervalIndex.from_tuples([(0, 2), (2, 2.5), (2.5, 5)])
pd.cut([1, 1.5, 2, 2.5, 3, 3.5], bins=bins, labels=['a', 'b', 'c']).tolist()

# output of pd.cut(...) is:
# [Interval(0.0, 2.0, closed='right'), Interval(0.0, 2.0, closed='right'), Interval(0.0, 2.0, closed='right'), Interval(2.0, 2.5, closed='right'), Interval(2.5, 5.0, closed='right'), Interval(2.5, 5.0, closed='right')]

Problem description

When bins is passed to pd.cut() as an IntervalIndex and labels is also provided, the output ignores the labels argument and returns the intervals themselves.
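
For contrast, here is a minimal sketch showing that labels are applied when the same bins are passed as a plain list of edges instead of an IntervalIndex. The variable names (values, edges, result) are illustrative and not part of the original report.

```python
import pandas as pd

values = [1, 1.5, 2, 2.5, 3, 3.5]

# Same boundaries as the IntervalIndex in the report, written as scalar edges.
# With the default right=True this yields (0, 2], (2, 2.5], (2.5, 5].
edges = [0, 2, 2.5, 5]

result = pd.cut(values, bins=edges, labels=['a', 'b', 'c']).tolist()
print(result)
# ['a', 'a', 'a', 'b', 'c', 'c']
```

The value 2 lands in the first bin because each interval is closed on the right.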

Expected Output

['a', 'a', 'a', 'b', 'c', 'c']

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.23.0
pytest: 3.0.5
pip: 10.0.1
setuptools: 39.0.1
Cython: 0.25.2
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.2
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: None
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.5
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 13
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

7 reactions
Ankan1991 commented, Mar 25, 2020

In case someone is looking for a workaround, you can create a helper dictionary that maps each interval to its label and look up the values. For the list in the example:

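# Map each interval in bins to the label it should carry.
d = dict(zip(bins, ['a', 'b', 'c']))
# labels is ignored for IntervalIndex bins, so it is omitted here.
[d.get(i) for i in pd.cut([1, 1.5, 2, 2.5, 3, 3.5], bins=bins).tolist()]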

For a pandas Series, you can use Series.map:

d = dict(zip(bins,['a','b','c']))
pd.cut(series,bins).map(d)
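
An alternative to the dictionary lookup, sketched here under the assumption that the labels are listed in the same order as the intervals in bins, is to rename the categories of the result. The names series, binned, and labeled are illustrative.

```python
import pandas as pd

bins = pd.IntervalIndex.from_tuples([(0, 2), (2, 2.5), (2.5, 5)])
labels = ['a', 'b', 'c']
series = pd.Series([1, 1.5, 2, 2.5, 3, 3.5])

# pd.cut returns a categorical Series whose categories are the intervals,
# in the same order as they appear in bins.
binned = pd.cut(series, bins)

# Swap each interval category for the label at the same position.
labeled = binned.cat.rename_categories(labels)
print(labeled.tolist())
```

This keeps the result categorical and ordered, whereas the dictionary approach on a plain list returns ordinary Python objects.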

1 reaction
TomAugspurger commented, May 28, 2018

Did you see the docstring for pd.cut?

labels : array or bool, optional
    Specifies the labels for the returned bins. Must be the same length as
    the resulting bins. If False, returns only integer indicators of the
    bins. This affects the type of the output container (see below).
    This argument is ignored when `bins` is an IntervalIndex.

I’m not sure exactly why, but labels is ignored for IntervalIndex (II) bins.
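
Given that documented behavior, one possible way around it is to convert the IntervalIndex back into scalar edges before calling pd.cut, so that labels takes effect. The sketch below is only valid when the intervals are sorted and contiguous; edges_from_intervals is a hypothetical helper written for this example, not a pandas function.

```python
import pandas as pd


def edges_from_intervals(intervals: pd.IntervalIndex) -> list:
    """Hypothetical helper: turn sorted, contiguous intervals into bin edges."""
    # Each interval must start exactly where the previous one ends.
    if not (intervals.right[:-1] == intervals.left[1:]).all():
        raise ValueError("intervals must be sorted and contiguous")
    return [intervals.left[0], *intervals.right]


bins = pd.IntervalIndex.from_tuples([(0, 2), (2, 2.5), (2.5, 5)])
edges = edges_from_intervals(bins)

# Preserve the closed side of the original intervals ('right' by default).
result = pd.cut(
    [1, 1.5, 2, 2.5, 3, 3.5],
    bins=edges,
    labels=['a', 'b', 'c'],
    right=(bins.closed == 'right'),
).tolist()
print(result)
```

For overlapping or gapped intervals this conversion does not apply, and mapping intervals to labels after the cut is the safer route.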
