GroupBy like API for resample

Since we wrote resample in xarray, pandas has updated resample to have a groupby-like API (e.g., df.resample('24H').mean() vs. the old df.resample('24H'), which used the mean by default).
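For readers who have not used the groupby-like form, here is a minimal sketch of the pandas behaviour described above. The frame, its column name `value`, and the dates are made up for illustration. The frequency is spelled "24h" because recent pandas releases prefer the lowercase alias over "24H".

```python
import numpy as np
import pandas as pd

# Illustrative data: 48 hourly samples with values 0.0 .. 47.0.
index = pd.date_range("2017-01-01", periods=48, freq="h")
df = pd.DataFrame({"value": np.arange(48.0)}, index=index)

# resample() returns a deferred, groupby-like object; nothing is
# aggregated until a reduction is requested explicitly.
resampler = df.resample("24h")

daily_mean = resampler.mean()  # one row per 24-hour bin
daily_max = resampler.max()    # the same bins, a different reduction

# Like a groupby object, the resampler can be iterated bin by bin.
bin_sizes = {label: len(chunk) for label, chunk in resampler}
```

The point of the deferred object is that the binning is stated once and any reduction (or a plain loop over the bins) can follow it.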

It would be nice to redo the xarray resample API to match, e.g., ds.resample(time='24H').mean() instead of ds.resample('time', '24H'). This would solve a few use cases, including grouped-resample arithmetic and iterating over groups, and would (mostly) take care of the need for pd.TimeGrouper support (https://github.com/pydata/xarray/issues/364). If we use **kwargs to match dimension names, this could be done with a minimally painful deprecation cycle.
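One way the **kwargs idea could support both calling conventions during a deprecation cycle is sketched below. This is only an illustration: `parse_resample_args` is a hypothetical helper, not xarray's actual code, and the warning text is invented.

```python
import warnings


def parse_resample_args(dim=None, freq=None, **indexer_kwargs):
    """Return (dim, freq) from either calling convention.

    Hypothetical helper for illustration only; not part of xarray.
    """
    if indexer_kwargs:
        # New style: resample(time="24H"), where the keyword is the dimension name.
        if dim is not None or freq is not None:
            raise TypeError("pass either (dim, freq) or dim=freq, not both")
        if len(indexer_kwargs) != 1:
            raise ValueError("resampling is supported along exactly one dimension")
        (dim, freq), = indexer_kwargs.items()
        return dim, freq

    # Old style: resample("time", "24H"). Still accepted, but warns.
    if dim is None or freq is None:
        raise TypeError("expected a dimension name and a frequency")
    warnings.warn(
        "resample(dim, freq) is deprecated; use resample(dim=freq) instead",
        FutureWarning,
        stacklevel=2,
    )
    return dim, freq


new_style = parse_resample_args(time="24H")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = parse_resample_args("time", "24H")
```

Both calls resolve to the same (dimension, frequency) pair, and only the old form emits a FutureWarning. A caveat of this particular sketch is that a dimension literally named `dim` or `freq` would collide with the legacy parameter names.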

Issue Analytics

State:
Created 7 years ago
Reactions: 3
Comments: 6 (1 by maintainers)

Top GitHub Comments

1 reaction

darothen commented, Feb 15, 2017

@MaximilianR Oh, the interface is easy enough to do, even maintaining backwards-compatibility (already have that working). I was considering going the route done with GroupBy and the classes that compose it, like DatasetGroupBy… basically, we just record the wanted resampling dimension and inject the grouping/resampling operations we want. Also adds the ability to specialize methods like .first() and .last(), which is done under the current implementation.

But… if there’s a simpler way, that might be preferable!

0 reactions

shoyer commented, Feb 15, 2017

I think this could be done with minimal GroupBy subclasses to supply the default dimension argument for aggregation functions. All the machinery on groupby should already be there.
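The design discussed in these two comments (record the resampling dimension, then reuse the groupby machinery with that dimension as the default for aggregations) can be illustrated with a toy model. The classes below are hypothetical stand-ins written in plain Python, not xarray's GroupBy or DatasetGroupBy. They assume a small table with dimensions "time" and "x".

```python
from statistics import mean


class ToyGroupBy:
    """Toy groupby over a 2-D table with dimensions ("time", "x").

    `rows` maps each time label to its list of values along "x";
    `key` assigns each time label to a group label.
    """

    def __init__(self, rows, key):
        self.groups = {}
        for time, values in rows.items():
            self.groups.setdefault(key(time), {})[time] = values

    def __iter__(self):
        # Iterating yields (group label, sub-table) pairs.
        return iter(self.groups.items())

    def reduce(self, func, dim):
        if dim == "time":
            # Collapse the time axis within each group, keeping "x".
            return {g: [func(col) for col in zip(*sub.values())] for g, sub in self}
        if dim == "x":
            # Collapse "x" for every time step, keeping "time".
            return {g: {t: func(v) for t, v in sub.items()} for g, sub in self}
        raise ValueError(f"unknown dimension: {dim!r}")

    def mean(self, dim):
        return self.reduce(mean, dim)


class ToyResample(ToyGroupBy):
    """Records the resampling dimension and uses it as the default `dim`."""

    def __init__(self, rows, key, dim="time"):
        super().__init__(rows, key)
        self._dim = dim

    def reduce(self, func, dim=None):
        return super().reduce(func, self._dim if dim is None else dim)

    def mean(self, dim=None):
        return self.reduce(mean, dim)


rows = {
    "2017-02-15T00": [1.0, 10.0],
    "2017-02-15T12": [3.0, 30.0],
    "2017-02-16T00": [5.0, 50.0],
}
# Group the time labels by calendar day (the first 10 characters).
daily = ToyResample(rows, key=lambda t: t[:10])
daily_means = daily.mean()  # no dim given: defaults to "time"
```

In this toy the subclass adds nothing except the remembered dimension and the default argument. Grouping, iteration, and reduction all come from the base class.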
