Do not pass lambdas to the backend in case of basic GroupBy aggregation

The current GroupBy implementation still passes Python lambdas to the backend for simple aggregations (for example, mean): https://github.com/modin-project/modin/blob/89f6cdde56abf5dcf9561ea6721fb31f84a855d5/modin/pandas/groupby.py#L128-L129 Non-Python engines cannot process these lambdas, so even if the backend supports a particular aggregation, it cannot execute it because the request arrives as a Python function. At the moment, all of these lambdas are passed into a single method, QueryCompiler.groupby_agg.
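To illustrate the problem, here is a minimal sketch in plain Python. It is not Modin code: `ToyEngine`, `NATIVE_AGGREGATIONS`, and this `groupby_agg` signature are hypothetical stand-ins for a non-Python backend. It only shows why an aggregation passed by name can be dispatched to an engine, while an equivalent lambda cannot.

```python
from collections import defaultdict

# Hypothetical set of aggregations a non-Python engine implements natively.
NATIVE_AGGREGATIONS = {"sum", "count", "mean"}


class ToyEngine:
    """Stand-in for a non-Python backend: it understands names, not callables."""

    def groupby_agg(self, rows, by, column, agg):
        if callable(agg):
            # An opaque Python function cannot be translated into an engine operation.
            raise TypeError("engine cannot execute a Python callable")
        if agg not in NATIVE_AGGREGATIONS:
            raise NotImplementedError(f"unsupported aggregation: {agg}")

        # Collect the values of `column` for each distinct value of `by`.
        groups = defaultdict(list)
        for row in rows:
            groups[row[by]].append(row[column])

        if agg == "sum":
            return {key: sum(vals) for key, vals in groups.items()}
        if agg == "count":
            return {key: len(vals) for key, vals in groups.items()}
        # Only "mean" remains at this point.
        return {key: sum(vals) / len(vals) for key, vals in groups.items()}


rows = [
    {"team": "a", "score": 1.0},
    {"team": "a", "score": 3.0},
    {"team": "b", "score": 10.0},
]
engine = ToyEngine()

# Passing the aggregation by name lets the engine pick its own implementation.
by_name = engine.groupby_agg(rows, by="team", column="score", agg="mean")

# A lambda that computes the same mean is rejected, because the engine
# has no way to know what the function does.
try:
    engine.groupby_agg(rows, by="team", column="score", agg=lambda v: sum(v) / len(v))
    lambda_rejected = False
except TypeError:
    lambda_rejected = True
```

In this sketch the frontend would need to forward the aggregation's identity (here a string) rather than wrap it in a function, which is the change the issue title asks for.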

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

2 reactions
vnlitvinov commented, Dec 7, 2021

Can we go ahead with #3373 then and fix the bloat as a separate refactoring step? It would unblock groupby implementations in OmniSci immediately while we figure out how to better split the beast.

2 reactions
YarShev commented, Aug 24, 2021

We likely want to keep the QC API as clean as possible, with all the implementations living in a different location. We could move those implementations to a kernels folder after #2957 is merged. Something like this:

.pandas/
   query_compiler.py
   .kernels/
      .dataframe.py
      .groupby.py
      .resample.py
      .etc.