Make parameter keep=False keep duplicates for nlargest/nsmallest
Code Sample, a copy-pastable example if possible
>>> import pandas as pd
>>> s = pd.Series([10,9,8,7,7,7,6])
>>> s.nlargest(4)
0 10
1 9
2 8
3 7
dtype: int64
Problem description
The docstrings list False as one of the possible argument values for keep. pandas raises a ValueError when attempting to use this parameter.
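A minimal sketch of the failure described above. The exact error message is not quoted in this report and may differ between pandas versions, so the snippet only checks that a ValueError is raised; the `raised` flag is just an illustrative variable.

```python
import pandas as pd

s = pd.Series([10, 9, 8, 7, 7, 7, 6])

# keep=False is listed in the docstring but rejected at call time.
try:
    s.nlargest(4, keep=False)
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")
```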
Expected Output
It would be nice to have nlargest work like this.
>>> s.nlargest(4, keep=False)
0 10
1 9
2 8
3 7
4 7
5 7
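Until something like this is supported, the expected output can be approximated by thresholding on the n-th largest value. This is only a sketch: `nlargest_keep_ties` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd


def nlargest_keep_ties(s: pd.Series, n: int) -> pd.Series:
    """Hypothetical helper: the n largest values plus every tie with the last one."""
    # Value of the n-th largest element (NaN if nothing is selected).
    threshold = s.nlargest(n).min()
    # Keep everything at or above that value, so ties are not cut off.
    kept = s[s >= threshold]
    # Stable sort so tied values keep their original relative order.
    return kept.sort_values(ascending=False, kind="mergesort")


s = pd.Series([10, 9, 8, 7, 7, 7, 6])
result = nlargest_keep_ties(s, 4)
print(result)
```

The result can contain more than n rows whenever the n-th value is tied, which is the behavior requested here.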
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.2
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.13.0
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.3.0.post
Issue Analytics
- State:
- Created 6 years ago
- Comments: 12 (8 by maintainers)
Top GitHub Comments
@gfyoung This doesn’t make sense to me. The keep option is only important whenever there are ties for the last value. Arguably, the most useful thing to do is to keep all of the ties, which would be keep=False. This option still exists in the docstrings in 0.21.
Also, the duplicated method has the same three options, so it matches that. Additionally, it was just recently removed without discussion.
edit: A better value for the parameter would be keep='all'
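For readers on a newer pandas: to the best of my knowledge, later releases accept keep='all' for nlargest/nsmallest, although this thread does not confirm it. The sketch below assumes such a version is installed; on 0.20.x or 0.21 the call would raise instead.

```python
import pandas as pd

s = pd.Series([10, 9, 8, 7, 7, 7, 6])

# Assumes a pandas release where keep="all" is supported:
# every value tied with the 4th largest is retained.
top = s.nlargest(4, keep="all")
print(top)
```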
It would also be useful to have a keep option of “none”!
s = pd.Series([10,9,8,7,7,7,6])
s.nlargest(4)
0 10
1 9
2 8
dtype: int64
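A rough sketch of what a “none” option could do, as I read the output above: when the last position is tied, drop all of the tied values instead of picking some of them. `nlargest_drop_ties` is a hypothetical helper, not an existing or proposed pandas API.

```python
import pandas as pd


def nlargest_drop_ties(s: pd.Series, n: int) -> pd.Series:
    """Hypothetical helper: n largest values, dropping a tied group that would overflow n."""
    # Value of the n-th largest element.
    threshold = s.nlargest(n).min()
    candidates = s[s >= threshold]
    # More than n candidates means the boundary value is tied: remove the whole group.
    if len(candidates) > n:
        candidates = candidates[candidates > threshold]
    return candidates.sort_values(ascending=False, kind="mergesort")


s = pd.Series([10, 9, 8, 7, 7, 7, 6])
dropped = nlargest_drop_ties(s, 4)
print(dropped)
```

With this series the three tied 7s are all removed, so fewer than n rows come back.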