Setting 'user_splits_fixed' in categorical binning
There is a problem when I try to set a value for user_splits_fixed. Suppose I have a column with the following event rate ratio for each value:
value | ratio |
---|---|
-1 | 0.011665 |
2 | 0 |
3 | 0.0133333 |
4 | 0.166667 |
7 | 0 |
8 | 0.0246041 |
9 | 0 |
10 | 0.025641 |
Then I set user_splits = [[2., 7., 9., 3., 10., 4.], [8], [-1]], user_splits_fixed=[True, True, True], monotonic_trend=None, dtype='categorical', and the program raises this error:
ValueError: Fixed user_splits [list([2.0, 7.0, 9.0, 3.0, 10.0, 4.0])] are removed because produce pure prebins. Provide different splits to be fixed.
What is wrong here?
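As a sanity check on the error message, here is a minimal pure-Python sketch (not optbinning code; the helper name is_pure is hypothetical). It tests whether any of the proposed groups is pure, using only the event rates from the table above and assuming every category has at least one record:

```python
# Event rate per category, copied from the table in the question.
event_rate = {
    -1: 0.011665, 2: 0.0, 3: 0.0133333, 4: 0.166667,
    7: 0.0, 8: 0.0246041, 9: 0.0, 10: 0.025641,
}

# The groups passed as user_splits in the question.
user_splits = [[2, 7, 9, 3, 10, 4], [8], [-1]]


def is_pure(group, rates):
    """Hypothetical helper: a group is pure when it contains only
    non-events (every rate is 0) or only events (every rate is 1)."""
    values = [rates[v] for v in group]
    return all(r == 0.0 for r in values) or all(r == 1.0 for r in values)


purity = [is_pure(group, event_rate) for group in user_splits]
print(purity)
```

Under this check none of the three groups is pure, because each contains at least one category with an event rate strictly between 0 and 1. That suggests the error does not come from the data shown in the table.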
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
I created this example to reproduce your problem:
After commit a6d015b2e9365ecc05cba48421972c562f7960c7, it should work as expected.
Hi @guillermo-navas-palencia. The splits produced by OptimalBinning with default options are [[2., 7., 9., 3., 10., 4., 8], [-1]]. When I pass these splits to the user_splits parameter, it returns the same error 😃.