Handle Error Policy in OrdinalEncoder
The preprocessor class OneHotEncoder allows transformation if unknown values are found. It would be great to introduce the same option to OrdinalEncoder. It seems simple to do, since OrdinalEncoder (as well as OneHotEncoder) is derived from _BaseEncoder, which actually implements the error-handling policy.
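To make the gap concrete, here is a minimal sketch of the behaviour described in the issue. It assumes scikit-learn and NumPy are installed; the toy colour data and variable names are made up for illustration. OneHotEncoder with handle_unknown="ignore" encodes an unseen category as an all-zero row, while an OrdinalEncoder left at its default settings raises a ValueError on the same input.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Toy data (illustrative only): "purple" never appears in the training set.
X_train = np.array([["red"], ["green"], ["blue"]])
X_test = np.array([["green"], ["purple"]])

# OneHotEncoder can be told to ignore unknown categories at transform time.
one_hot = OneHotEncoder(handle_unknown="ignore").fit(X_train)
one_hot_rows = one_hot.transform(X_test).toarray()
# Learned categories are sorted (blue, green, red), so "green" maps to
# [0, 1, 0] and the unseen "purple" maps to an all-zero row.

# OrdinalEncoder with default settings rejects the unseen category.
ordinal = OrdinalEncoder().fit(X_train)
try:
    ordinal.transform(X_test)
    ordinal_raised = False
except ValueError:
    ordinal_raised = True
```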
Issue Analytics
- State:
- Created 5 years ago
- Reactions: 16
- Comments: 26 (14 by maintainers)
Top GitHub Comments
Personally I find this issue really annoying. At the moment we cannot use OrdinalEncoder on data with a long-tailed distribution of categorical variable frequencies in a cross-validation loop without triggering the unknown category exception at prediction time.
Friends, could you speed up this fix? In real life there are often many differences between train and test data. As @ogrisel stated above: "Personally I find this issue really annoying."
Could you at least do what @daskol wrote above?
"It could ignore unknown category values as OneHotEncoder does. Another possible scenario (say, a sentinel) could replace an unknown value with a default one, which could be specified in OrdinalEncoder's constructor."
Yes, just ignore unknown categories in transform, or replace the unknown value with a default one that could be specified in OrdinalEncoder's constructor or set to none... but please do something!
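The sentinel idea from the comments can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the class name SentinelOrdinalEncoder, its unknown_value argument, and the sample data are hypothetical and are not scikit-learn's implementation. Later scikit-learn releases did add a comparable option to OrdinalEncoder (handle_unknown="use_encoded_value" together with unknown_value), so on a recent version the built-in parameter is probably the better choice.

```python
class SentinelOrdinalEncoder:
    """Hypothetical single-column encoder; not scikit-learn's implementation."""

    def __init__(self, unknown_value=-1):
        # Code returned for categories that were not seen during fit.
        self.unknown_value = unknown_value

    def fit(self, values):
        # Assign consecutive integer codes to the sorted unique categories.
        self.categories_ = sorted(set(values))
        self._codes = {cat: i for i, cat in enumerate(self.categories_)}
        return self

    def transform(self, values):
        # Unknown categories fall back to the sentinel instead of raising.
        return [self._codes.get(v, self.unknown_value) for v in values]


encoder = SentinelOrdinalEncoder(unknown_value=-1).fit(["low", "medium", "high"])
encoded = encoder.transform(["medium", "very high", "low"])
```

The sentinel should be a value that cannot collide with a real code (a negative integer here), so downstream models can tell "unseen" apart from every known category.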