DOC: Pandas groupby agg example for k-hot encoding


Location of the documentation

https://pandas.pydata.org/pandas-docs/version/0.22.0/generated/pandas.DataFrame.aggregate.html#pandas.DataFrame.aggregate


Documentation problem

It is strange that even on Stack Overflow, "reverse explode" has only a single answer (and that question was asked very recently). Adding a few more examples to the documentation would therefore be worthwhile.

Suggested fix for documentation

The following example shows how to perform k-hot encoding with agg.

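# Largest class ID; this sets the length of every k-hot vector.
maximum = df["ClassId"].max()
# Collect the class IDs of each image into one list per ImageId.
df = df.groupby("ImageId").agg({"ClassId": lambda x: x.tolist()})
# Position i is 1 when class i+1 appears in the image's list, else 0.
df["ClassId"] = df["ClassId"].apply(lambda x: [1 if i + 1 in x else 0 for i in range(maximum)])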
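The snippet above assumes an existing df with ImageId and ClassId columns. A minimal self-contained sketch of the same groupby/agg idea follows; the sample data and the names num_classes, grouped, and k_hot are made up for illustration, and the vector length is taken from the largest ClassId in the column, on the assumption that class IDs are positive integers starting at 1.

```python
import pandas as pd

# Hypothetical sample data: one row per (image, class) pair.
df = pd.DataFrame({
    "ImageId": ["a", "a", "b", "c", "c"],
    "ClassId": [1, 3, 2, 3, 4],
})

# Length of each k-hot vector (assumes class IDs run from 1 upward).
num_classes = int(df["ClassId"].max())

# One list of class IDs per image.
grouped = df.groupby("ImageId").agg({"ClassId": lambda x: x.tolist()})

# Turn each list into a 0/1 vector: position i marks class i + 1.
grouped["ClassId"] = grouped["ClassId"].apply(
    lambda ids: [1 if i + 1 in ids else 0 for i in range(num_classes)]
)

k_hot = grouped["ClassId"].to_dict()
print(k_hot)
# {'a': [1, 0, 1, 0], 'b': [0, 1, 0, 0], 'c': [0, 0, 1, 1]}
```

Every vector has the same length here because num_classes is computed once from the whole column rather than per group.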


Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 32 (15 by maintainers)

Top GitHub Comments

Teut2711 commented, Oct 31, 2020

Hmm, ok. I already have the dev environment set up, so it should not be very difficult to get started.

Teut2711 commented, Oct 31, 2020

Also, can you share which function you would use in place of apply to perform that encoding? It would improve my skills.

https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html

I have already seen the docs for get_dummies. I thought you meant some vectorized method to convert [1, 3] to [1, 0, 1, 0].
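For readers with the same question, one possible way to avoid apply is sketched below. It is not necessarily what the maintainers had in mind: it swaps in pd.crosstab rather than get_dummies, uses made-up sample data, and assumes class IDs are positive integers starting at 1.

```python
import pandas as pd

# Hypothetical sample data: one row per (image, class) pair.
df = pd.DataFrame({
    "ImageId": ["a", "a", "b", "c", "c"],
    "ClassId": [1, 3, 2, 3, 4],
})

# Count each (image, class) pair, then cap counts at 1 to get 0/1 flags.
table = pd.crosstab(df["ImageId"], df["ClassId"]).clip(upper=1)

# Make sure every class from 1 to the maximum has a column, even if unused.
all_classes = list(range(1, int(df["ClassId"].max()) + 1))
table = table.reindex(columns=all_classes, fill_value=0)

print(table.loc["a"].tolist())
# [1, 0, 1, 0]
```

Here the image that has classes [1, 3] ends up as the row [1, 0, 1, 0], with one column per class instead of one list per image.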
