add 'most_frequent' drop method to OneHotEncoder

See original GitHub issue

I would like to propose adding a ‘most_frequent’ method as one of the drop parameter options in OneHotEncoder.

I find that using the most frequent value as the reference level aids with interpreting the newly created OHE features. The ‘first’ method is not very intuitive.
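As a rough sketch of how this can be approximated today: the `drop` parameter of OneHotEncoder already accepts an explicit list with one category per feature, so the most frequent level can be computed up front and passed in. The helper name `most_frequent_levels` and the toy data are made up for illustration, and ties are broken by sort order, which is an assumption rather than anything specified in this issue.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder


def most_frequent_levels(X):
    """Return the most frequent category of each column of a 2-D array.

    Hypothetical helper, not part of scikit-learn. Ties go to the
    category that sorts first, because np.unique returns sorted values.
    """
    X = np.asarray(X, dtype=object)
    levels = []
    for column in X.T:
        values, counts = np.unique(column, return_counts=True)
        levels.append(values[np.argmax(counts)])
    return levels


# Toy data: "red" and "M" are the most frequent levels.
X = np.array(
    [["red", "S"], ["red", "M"], ["blue", "M"], ["red", "M"]],
    dtype=object,
)

drop = most_frequent_levels(X)

# Passing one category per feature makes it the reference level.
encoder = OneHotEncoder(drop=drop).fit(X)
encoded = encoder.transform(X).toarray()
```

With this setup an all-zero row means "every feature is at its most common level", which is the interpretability benefit described above.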

It would also be helpful if a dropped_levels_ attribute were included, instead of having to derive it from the categories_ and drop_idx_ attributes. Thanks
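For reference, the derivation from categories_ and drop_idx_ mentioned above might look like the sketch below. The function name `dropped_levels` is hypothetical (it mirrors the requested attribute and is not a scikit-learn API), and the handling of a missing or `None` drop_idx_ is an assumption meant to cover encoders fitted without `drop`.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder


def dropped_levels(encoder):
    """List the dropped category per feature for a fitted OneHotEncoder.

    Hypothetical helper. Returns None for a feature where nothing
    was dropped.
    """
    drop_idx = getattr(encoder, "drop_idx_", None)
    if drop_idx is None:
        return [None] * len(encoder.categories_)
    return [
        None if idx is None else categories[idx]
        for categories, idx in zip(encoder.categories_, drop_idx)
    ]


# Toy data; categories_ is sorted, so 'first' drops "blue" and "M".
X = np.array(
    [["red", "S"], ["red", "M"], ["blue", "M"], ["red", "M"]],
    dtype=object,
)

encoder = OneHotEncoder(drop="first").fit(X)
levels = dropped_levels(encoder)
```

Note that `drop="first"` removes "blue" here even though "red" is far more common, which illustrates why the reference level chosen by 'first' can be unintuitive.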

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
jnothman commented, Oct 20, 2020

Makes sense. I’d be happy to see this.

1 reaction
trewaite commented, Oct 19, 2020

I would find this quite useful as well. I am willing to make a pull request if the core development team approves.


