Implement groupby for continuous dimensions

HoloViews supports a groupby operation for discrete/categorical dimensions, but as far as I can see there is no support for grouping over a continuous dimension, which requires binning the values with a specified bin width or number of bins. The xarray interface might provide this already (https://github.com/ioam/holoviews/issues/804), but for pandas the separate pd.cut function would seem to be needed. Philipp suggests adding a method:

import numpy as np
import pandas as pd
import holoviews as hv

def groupby_bin(dataset, dimension, bins=10):
    dimension = dataset.get_dimension(dimension)
    values = dataset.dimension_values(dimension)
    other_dims = [d for d in dataset.kdims if d is not dimension]
    # pd.cut computes the bin edges; only the edges are used below
    _, edges = pd.cut(values, bins, retbins=True)
    hmap = hv.HoloMap(kdims=[dimension])
    for start, end in zip(edges[:-1], edges[1:]):
        # key each bin by its midpoint and drop the binned dimension
        mid = np.mean([start, end])
        hmap[mid] = dataset.select(**{dimension.name: (start, end)}).reindex(other_dims)
    return hmap

but I have not tested this.
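For comparison, the pandas-only version of the same idea (cut a continuous column into intervals, then group on those intervals) can be sketched as follows. The column names "x" and "z" and the random sample data are made up for illustration; this is not part of the proposed HoloViews method.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a continuous column "x" and a value column "z".
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.uniform(0.0, 10.0, 200), "z": rng.normal(size=200)})

# Cut "x" into 5 equal-width intervals; retbins=True also returns the bin edges.
categories, edges = pd.cut(df["x"], bins=5, retbins=True)

# Group rows by the interval they fall into and aggregate the other column.
summary = df.groupby(categories, observed=False)["z"].agg(["count", "mean"])
print(summary)
```

Here observed=False keeps empty intervals in the result, so the output always has one row per bin.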

Issue Analytics

Created 7 years ago
Reactions: 1
Comments: 13 (12 by maintainers)

Top GitHub Comments

1 reaction

TylerTCF commented, May 1, 2018

Would this resample allow users to take a datetime kdim and aggregate it to a different sampling frequency, for example hourly measurements aggregated into daily, monthly, or yearly intervals?
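To make the question concrete, this is the kind of aggregation meant, sketched in plain pandas rather than HoloViews. The series name and values are invented for the example, and nothing here implies that HoloViews exposes an equivalent call.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly measurements covering three full days.
index = pd.date_range("2018-05-01", periods=72, freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.arange(72, dtype=float), index=index, name="measurement")

# Aggregate to daily means; "D" is the pandas offset alias for calendar days.
daily = hourly.resample("D").mean()
print(daily)
```

Coarser intervals work the same way with a different offset alias passed to resample.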

0 reactions

philippjfr commented, Mar 23, 2020

This can now technically be done with the new transform method:

import numpy as np
import holoviews as hv

ds = hv.Dataset(np.random.randn(1000, 3), ['x', 'y'], 'z')
ds.transform(x=hv.dim('x').bin(np.linspace(-1, 1, 11))).groupby('x').apply(hv.Scatter)
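A rough NumPy sketch of what the binning step in this snippet does to the 'x' values, assuming it replaces each value with the center of the (lower, upper] bin it falls into and leaves values outside the edges as NaN. The helper name bin_to_centers is hypothetical and this is an illustration, not the HoloViews implementation.

```python
import numpy as np

def bin_to_centers(values, edges):
    """Map each value to the midpoint of its (lower, upper] bin; NaN if outside."""
    values = np.asarray(values, dtype=float)
    edges = np.asarray(edges, dtype=float)
    centers = (edges[:-1] + edges[1:]) / 2.0
    binned = np.full(values.shape, np.nan)
    for lower, upper, center in zip(edges[:-1], edges[1:], centers):
        binned[(values > lower) & (values <= upper)] = center
    return binned

edges = np.linspace(-1, 1, 11)          # same edges as in the comment above
x = np.array([-2.0, -0.95, 0.05, 0.99, 3.0])
binned = bin_to_centers(x, edges)
print(binned)
```

After this step the dimension holds at most ten distinct values (plus NaN), which is what makes a subsequent discrete groupby on 'x' meaningful.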
