`Series.resample().nlargest` produces incorrect output

Code Sample, a copy-pastable example if possible

With this setup:

import numpy as np
import pandas as pd

n = 1000
dates = pd.date_range(start='2010-01-01', periods=n)
rain_random = pd.Series(data=np.random.uniform(size=n), index=dates)

these two operations give different results:

rain_random.groupby(rain_random.index.year).nlargest(3)
rain_random.resample('A').nlargest(3)

Problem description

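The Series.resample().nlargest() operation is inconsistent with DataFrame.resample()[column].nlargest() and the groupby equivalent. It emits a FutureWarning and, as the warning states, implicitly takes the mean of each year before nlargest is applied.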

Output:

/Users/schofield/miniconda/envs/py36/lib/python3.6/site-packages/ipykernel_launcher.py:1: FutureWarning: 
.resample() is now a deferred operation
You called nlargest(...) on this deferred object which materialized it into a series
by implicitly taking the mean.  Use .resample(...).mean() instead
  """Entry point for launching an IPython kernel.
Out[427]:
2010-12-31    0.507550
2012-12-31    0.490082
2011-12-31    0.478356
dtype: float64
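
Going by the warning text, the three values above appear to be the largest yearly means rather than the largest daily values. A minimal sketch of that reading, assuming a recent pandas and using a `pd.offsets.YearEnd()` offset in place of the `'A'` alias; the seed and the names `yearly_means` and `largest_means` are illustrative and not part of the report:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, illustrative only
n = 1000
dates = pd.date_range(start="2010-01-01", periods=n)
rain_random = pd.Series(rng.uniform(size=n), index=dates)

# What the warning describes: each year is first reduced to its mean,
# and only then are the three largest of those means selected.
yearly_means = rain_random.resample(pd.offsets.YearEnd()).mean()
largest_means = yearly_means.nlargest(3)
print(largest_means)
```

The result has one row per year-end label, which matches the shape of the output shown above but not the per-year selection the report expects.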

Expected output (illustrative; taken from a different series, with five values per year):

Date        Date      
1930-12-31  1930-10-06      288.135370
            1930-10-05      285.587734
            1930-10-07      259.439935
            1930-10-08      227.587389
            1930-10-09      190.054844
1931-12-31  1931-01-26     3052.104566
            1931-01-25     2839.126102
            1931-01-29     2196.167129
            1931-02-01     1953.331709
            1931-01-27     1893.975328
1932-12-31  1932-01-19     9526.953864
            1932-01-20     4278.291105
            1932-03-03     2952.348903
            1932-03-02     2946.385433
            1932-03-04     2098.108897
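
One way to get output of this shape on the example series is to group with `pd.Grouper` instead of calling `nlargest` on the resampler. This is a hedged sketch of a workaround, not a fix for the underlying behavior; it assumes a recent pandas, and the seed and the names `by_grouper` and `by_year` are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, illustrative only
n = 1000
dates = pd.date_range(start="2010-01-01", periods=n)
rain_random = pd.Series(rng.uniform(size=n), index=dates)

# Group into calendar-year bins labelled by year end, then take the
# three largest values within each bin.
by_grouper = rain_random.groupby(pd.Grouper(freq=pd.offsets.YearEnd())).nlargest(3)

# The same selection keyed by integer year, as in the report.
by_year = rain_random.groupby(rain_random.index.year).nlargest(3)

print(by_grouper)
```

Both results carry a two-level index (group label, original date) and select the same values; only the outer label differs (year-end timestamp versus integer year).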

pd.show_versions() output:

INSTALLED VERSIONS

commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: en_AU.UTF-8

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.5.0

Issue Analytics

  • State: open
  • Created 6 years ago
  • Reactions: 1
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

jreback commented, Jun 8, 2020 (1 reaction)

pandas and virtually all open source projects are volunteer-run

the core team will review pull requests

since there are 3000+ open issues, most patches must come from the community

issues get fixed when folks like you open pull requests

jreback commented, Jun 7, 2020 (1 reaction)

pull requests are accepted; this is how issues get addressed in open source
