Do not pass lambdas to the backend in case of basic GroupBy aggregation


The current GroupBy implementation still passes Python lambdas to the backend for simple aggregations (for example, mean): https://github.com/modin-project/modin/blob/89f6cdde56abf5dcf9561ea6721fb31f84a855d5/modin/pandas/groupby.py#L128-L129 Non-Python engines cannot process these lambdas, so even if the backend supports a particular aggregation, it can't be executed because it arrives as a Python function. At the moment, all of these lambdas are passed into a single method, QueryCompiler.groupby_agg.
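To make the problem concrete, here is a minimal, self-contained sketch. It does not use Modin's actual code: `toy_backend_groupby_agg` and `SUPPORTED_AGGS` are hypothetical names standing in for a non-Python engine that can only dispatch aggregations it recognizes by name. The point it illustrates is that the same aggregation succeeds when identified by a string and fails when wrapped in an opaque lambda.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stand-in for a non-Python engine: it only understands
# aggregations it can look up by name.
SUPPORTED_AGGS = {"mean": mean, "sum": sum, "min": min, "max": max}


def toy_backend_groupby_agg(rows, by, column, agg):
    """Group `rows` (a list of dicts) by `by` and aggregate `column`.

    `agg` must be a string naming a supported aggregation. An opaque Python
    callable is rejected, mimicking an engine that cannot run Python code.
    """
    if not isinstance(agg, str):
        raise TypeError("backend cannot execute an opaque Python callable")
    if agg not in SUPPORTED_AGGS:
        raise NotImplementedError(f"unsupported aggregation: {agg}")
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[column])
    return {key: SUPPORTED_AGGS[agg](values) for key, values in groups.items()}


rows = [
    {"k": "a", "v": 1},
    {"k": "a", "v": 3},
    {"k": "b", "v": 10},
]

# Passing the aggregation by name lets the backend dispatch it.
result = toy_backend_groupby_agg(rows, "k", "v", "mean")

# Passing a lambda that computes the same thing is rejected, even though
# the backend supports "mean".
try:
    toy_backend_groupby_agg(rows, "k", "v", lambda values: mean(values))
    lambda_rejected = False
except TypeError:
    lambda_rejected = True
```

In this sketch the fix amounts to carrying the aggregation's identity (here, a string) down to the backend instead of a function object, which is the direction the issue title suggests.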

Issue Analytics

State:
Created 2 years ago
Comments: 11 (11 by maintainers)

Top GitHub Comments

2 reactions

vnlitvinov commented, Dec 7, 2021

Can we go ahead with #3373 then and fix the bloat as a separate refactoring step? It would unblock groupby implementations in OmniSci immediately while we figure out how to better split the beast.

2 reactions

YarShev commented, Aug 24, 2021

We likely want to keep the QC API as clean as possible, so all the implementations should live in a different location. We could move those implementations to a kernels folder after #2957 is merged. Something like this:

.pandas/
   query_compiler.py
   .kernels/
      .dataframe.py
      .groupby.py
      .resample.py
      .etc.
