Handle Error Policy in OrdinalEncoder

The OneHotEncoder preprocessor can still transform data when it encounters unknown values. It would be great to add the same option to OrdinalEncoder. This seems simple to do, since OrdinalEncoder (like OneHotEncoder) derives from _BaseEncoder, which already implements the error-handling policy.
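
For reference, the OneHotEncoder behavior the request refers to can be sketched as follows. This is a minimal illustration, assuming a scikit-learn version in which OneHotEncoder accepts handle_unknown="ignore"; the color values are made up for the example.

```python
from sklearn.preprocessing import OneHotEncoder

# Fit on two known categories; they are stored in sorted order: ["green", "red"].
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit([["red"], ["green"]])

# "blue" was never seen during fit. With handle_unknown="ignore" its row is
# encoded as all zeros instead of raising an error.
encoded = encoder.transform([["red"], ["blue"]]).toarray()
print(encoded)
```

The unknown category simply produces a row with no active column, which is the kind of tolerance the issue asks OrdinalEncoder to offer.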

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 16
  • Comments: 26 (14 by maintainers)

Top GitHub Comments

21 reactions
ogrisel commented, Mar 27, 2019

Personally I find this issue really annoying. At the moment we cannot use OrdinalEncoder on data with a long-tailed distribution of categorical variable frequencies in a cross-validation loop without triggering the unknown category exception at prediction time.
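
The failure described here can be reproduced with a small sketch. The train/test values below are hypothetical and stand in for a cross-validation fold in which a rare category lands only in the held-out part; the encoder uses its default settings.

```python
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical fold: "rare" appears only in the held-out data.
train = [["common"], ["common"], ["frequent"]]
test = [["common"], ["rare"]]

encoder = OrdinalEncoder()  # default policy: raise on unknown categories
encoder.fit(train)

try:
    encoder.transform(test)
    raised = False
except ValueError as exc:
    # The unknown category "rare" triggers the exception at prediction time.
    raised = True
    print(f"transform failed: {exc}")
```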

5 reactions
Sandy4321 commented, Feb 12, 2020

Friends, could you speed up this fix? In real life there are often many differences between train and test data. As @ogrisel stated above: "Personally I find this issue really annoying."

Could you at least do what @daskol wrote above:
"It could ignore unknown categories as OneHotEncoder does. Another possible scenario (say, sentinel) could replace the unknown value with a default one, which could be specified in OrdinalEncoder’s constructor."

Yes, just ignore unknown categories in transform: replace the unknown value with a default one, which could be specified in OrdinalEncoder’s constructor or set to none. But please do something!

Top Results From Across the Web

Error when trying .transform for OrdinalEncoder from Scikit Learn
You are overriding your ord_enc object every time you fit_transform to a new column. You should apply it to all the relevant...

OrdinalEncoder — 1.2.0 - Feature-engine - Read the Docs
Indicates what to do when categories not present in the train set are encountered during transform. If 'raise', then rare categories will raise...

Pipeline OrdinalEncoder ValueError Found unknown categories
Your problem is that the model has encountered a value in the test data that it had not seen in the training data....

sklearn.preprocessing.OrdinalEncoder
When set to 'error' an error will be raised in case an unknown categorical feature is present during transform. When set to 'use_encoded_value',...

A Better OrdinalEncoder for Scikit-learn - Frank WorkShop
In this blog, I develop a new Ordinal Encoder which makes up the ... encoder that is able to handle multiple categorical features...
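
The 'use_encoded_value' option mentioned in the sklearn.preprocessing.OrdinalEncoder result above can be sketched as follows. This assumes a scikit-learn release that supports handle_unknown="use_encoded_value" together with unknown_value (0.24 or later, as far as I know); the sentinel -1 and the sample data are illustrative choices, not values taken from the issue. The encoder is fitted once on all columns rather than re-fitted per column.

```python
from sklearn.preprocessing import OrdinalEncoder

# Unknown categories are mapped to the sentinel -1 instead of raising.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)

# Fit a single encoder on both columns at once.
# Learned categories (sorted): column 0 -> ["cat", "dog"],
#                              column 1 -> ["large", "small"].
encoder.fit([["cat", "small"], ["dog", "large"]])

# "bird" and "medium" were not seen during fit.
result = encoder.transform([["dog", "small"], ["bird", "medium"]])
print(result)
```

The sentinel must not collide with any code the encoder assigns to a known category, which is why a negative value is a common choice.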
