Make parameter keep=False keep duplicates for nlargest/nsmallest

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> s = pd.Series([10, 9, 8, 7, 7, 7, 6])
>>> s.nlargest(4)
0    10
1     9
2     8
3     7
dtype: int64

Problem description

The docstrings for nlargest and nsmallest list False as one of the possible values for the keep parameter, but pandas raises a ValueError when keep=False is actually passed.
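
A minimal sketch for reproducing the error, assuming an installed pandas. The exact error message varies between versions, so the snippet only captures it rather than matching it; the variable name `error_message` is just for illustration.

```python
import pandas as pd

s = pd.Series([10, 9, 8, 7, 7, 7, 6])

# keep=False is not an accepted value, so this call is expected to raise.
try:
    s.nlargest(4, keep=False)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```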

Expected Output

It would be nice to have nlargest work like this.

>>> s.nlargest(4, keep=False)
0    10
1     9
2     8
3     7
4     7
5     7
dtype: int64
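
For comparison, later pandas releases (0.24 and newer, to the best of my knowledge) accept keep='all', which gives the behavior requested here. A short sketch, assuming such a version is installed:

```python
import pandas as pd

s = pd.Series([10, 9, 8, 7, 7, 7, 6])

# Default keep="first": exactly n rows, ties broken by order of appearance.
top_first = s.nlargest(4)

# keep="all": every value tied with the n-th largest is retained,
# so the result may contain more than n rows.
top_all = s.nlargest(4, keep="all")

print(top_first.tolist())
print(top_all.tolist())
```

With keep='all' the result can be longer than n, so code that assumes exactly n rows needs to account for that.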

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.2
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.13.0
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.3.0.post

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

tdpetrou commented, Dec 6, 2017 (3 reactions)

@gfyoung This doesn’t make sense to me. The keep option is only important whenever there are ties for the last value. Arguably, the most useful thing to do is to keep all of the ties which would be keep=False. This option still exists in the docstrings in 0.21.

Also, the duplicated method has the same three options, so keep=False would match that. Additionally, the option was just recently removed without discussion.

edit: A better value for the parameter would be keep='all'
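
To make the tie-handling point concrete, here is a plain-Python sketch of how the three options differ when values tie at the cutoff. The function `nlargest_keep` is hypothetical and written only for illustration; it is not pandas' implementation and makes no claim to match its ordering in every case.

```python
def nlargest_keep(values, n, keep="first"):
    """Return (position, value) pairs for the n largest values.

    Hypothetical helper: `keep` only matters when several values
    tie with the n-th largest one.
    """
    if keep not in ("first", "last", "all"):
        raise ValueError('keep must be "first", "last" or "all"')
    if n <= 0 or not values:
        return []

    # Sort by value, descending; the sort is stable, so tied values
    # stay in their original order.
    ranked = sorted(enumerate(values), key=lambda pair: -pair[1])
    if n >= len(ranked):
        return ranked

    cutoff = ranked[n - 1][1]
    above = [pair for pair in ranked if pair[1] > cutoff]
    ties = [pair for pair in ranked if pair[1] == cutoff]

    if keep == "all":
        return above + ties          # may return more than n items
    slots = n - len(above)           # how many tied values still fit
    if keep == "first":
        return above + ties[:slots]
    return above + ties[-slots:]     # keep == "last"


values = [10, 9, 8, 7, 7, 7, 6]
print(nlargest_keep(values, 4, keep="first"))
print(nlargest_keep(values, 4, keep="last"))
print(nlargest_keep(values, 4, keep="all"))
```

With this input, "first" and "last" both return four items and differ only in which of the tied 7s they pick, while "all" returns all six.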

artinthetrees commented, Jan 28, 2021 (0 reactions)

It would also be useful to have a keep option of “none”!

s = pd.Series([10,9,8,7,7,7,6])
s.nlargest(4)
0    10
1     9
2     8
dtype: int64
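
pandas has no keep="none" option as far as I know, but the behavior suggested in this comment can be approximated on top of keep="all". A rough sketch, assuming a pandas version that supports keep="all"; the helper name `nlargest_drop_ties` is hypothetical:

```python
import pandas as pd


def nlargest_drop_ties(series, n):
    """Return the n largest values, dropping every value tied at the cutoff
    when the ties would push the result past n rows (hypothetical helper)."""
    top = series.nlargest(n, keep="all")
    if len(top) <= n:
        # No overflow, so there are no ambiguous ties to drop.
        return top
    boundary = top.iloc[-1]  # the tied value at the cutoff
    return top[top != boundary]


s = pd.Series([10, 9, 8, 7, 7, 7, 6])
result = nlargest_drop_ties(s, 4)
print(result.tolist())
```

Note that the result can be shorter than n, which is the trade-off of discarding ambiguous ties.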
