Implement groupby for continuous dimensions


HoloViews supports a groupby operation for discrete/categorical dimensions, but as far as I can see there is no support for grouping over a continuous dimension, which requires a specified bin width. The xarray interface might already provide this (https://github.com/ioam/holoviews/issues/804), but for pandas the separate pd.cut function seems to be needed. Philipp suggests adding a method:

import numpy as np
import pandas as pd
import holoviews as hv

def groupby_bin(dataset, dimension, bins=10):
    dimension = dataset.get_dimension(dimension)
    values = dataset.dimension_values(dimension)
    other_dims = [d for d in dataset.kdims if d is not dimension]
    # pd.cut computes the bin edges; the per-value categories are not needed
    _, edges = pd.cut(values, bins, retbins=True)
    hmap = hv.HoloMap(kdims=[dimension])
    for start, end in zip(edges[:-1], edges[1:]):
        # Key each bin by its midpoint and drop the binned dimension from the kdims
        mid = np.mean([start, end])
        hmap[mid] = dataset.select(**{dimension.name: (start, end)}).reindex(other_dims)
    return hmap

but I have not tested this.
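
For comparison, the pandas-only version of the same idea (bin with pd.cut, then group) might look like the sketch below. The column names, sample values, and bin edges are made up for illustration; this is not part of HoloViews.

```python
import numpy as np
import pandas as pd

# Made-up sample data: 'x' is the continuous dimension, 'z' a measured value.
df = pd.DataFrame({
    "x": [0.1, 0.4, 1.2, 1.9, 2.5, 2.6],
    "z": [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
})

# Explicit bin edges (an assumption for this example); include_lowest keeps
# a value that is exactly equal to the first edge.
edges = np.array([0.0, 1.0, 2.0, 3.0])
df["x_bin"] = pd.cut(df["x"], bins=edges, include_lowest=True)

# Group on the categorical bin column; observed=True skips empty bins.
binned_mean = df.groupby("x_bin", observed=True)["z"].mean()
binned_count = df.groupby("x_bin", observed=True)["z"].count()
```

Each bin becomes one group, which is the role the HoloMap keys play in the suggested groupby_bin.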

Issue Analytics

  • State: open
  • Created 7 years ago
  • Reactions: 1
  • Comments: 13 (12 by maintainers)

Top GitHub Comments

1 reaction
TylerTCF commented, May 1, 2018

Would this resample allow users to take a datetime kdim and aggregate it to a different sampling frequency, for example hourly measurements aggregated into daily, monthly, or yearly intervals?
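
To make the question concrete, this is the kind of frequency change it describes, sketched in plain pandas rather than HoloViews. The series and its values are invented for the example, and nothing here shows how HoloViews would expose it.

```python
import numpy as np
import pandas as pd

# Invented data: 48 hourly measurements starting at midnight.
index = pd.date_range("2018-05-01", periods=48, freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.arange(48, dtype=float), index=index)

# Downsample to one value per calendar day by averaging each day's samples.
daily = hourly.resample("D").mean()
```

Here the bins are calendar days rather than fixed-width numeric intervals, but the operation is still a groupby over a binned continuous dimension.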

0 reactions
philippjfr commented, Mar 23, 2020

This can now technically be done with the new transform method:

import numpy as np
import holoviews as hv
ds = hv.Dataset(np.random.randn(1000, 3), ['x', 'y'], 'z')
ds.transform(x=hv.dim('x').bin(np.linspace(-1, 1, 11))).groupby('x').apply(hv.Scatter)
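
The binning step in that transform replaces each x value with a label for its bin, so that groupby then sees a small set of discrete values. A rough NumPy sketch of the idea follows, assuming the labels are the bin centers and that values outside the edges become NaN. The helper name bin_to_centers is hypothetical, and this is not the HoloViews implementation.

```python
import numpy as np

def bin_to_centers(values, edges):
    """Map each value to the center of the bin that contains it (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    edges = np.asarray(edges, dtype=float)
    centers = (edges[:-1] + edges[1:]) / 2
    # np.digitize returns i such that edges[i-1] <= value < edges[i]
    idx = np.digitize(values, edges) - 1
    inside = (idx >= 0) & (idx < len(centers))
    out = np.full(values.shape, np.nan)  # values outside all bins stay NaN
    out[inside] = centers[idx[inside]]
    return out

edges = np.linspace(-1, 1, 11)  # same edges as in the comment above
binned = bin_to_centers([-0.95, 0.05, 0.5, 3.0], edges)
```

After this step, every value inside the edges shares a label with the other values in its bin, which is what makes the subsequent groupby('x') produce one group per bin.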

