Scalar fit_params no longer handled. Was: Singleton array (insert value here) cannot be considered a valid collection.
Description
TypeError: Singleton array array(True) cannot be considered a valid collection.
Steps/Code to Reproduce
Found when running RandomizedSearchCV with LightGBM. This previously worked fine. The latest update requires that every entry in **fit_params be checked for ‘slicability’, which is difficult when some fit params are scalars such as early_stopping_rounds=5.
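To see why a scalar trips the check, here is a minimal sketch (plain NumPy, no scikit-learn) of what happens when a scalar is coerced the way the sample-counting validation effectively does: it becomes a zero-dimensional "singleton" array, which has no length.

```python
import numpy as np

# A scalar fit param (verbose=True, early_stopping_rounds=5, ...) becomes a
# zero-dimensional "singleton" array once coerced with np.asarray -- which is
# effectively what the sample-counting check does.
v = np.asarray(True)
print(v.shape)   # () -- zero-dimensional

# A 0-d array has no length, so it cannot be treated as a collection of samples.
try:
    len(v)
except TypeError:
    print("len() fails: not a valid collection")
```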
# Import the modules
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import RandomizedSearchCV

# Fixed model parameters
mod_fixed_params = {
    'boosting_type': 'gbdt',
    'random_state': 0,
    'silent': False,
    'objective': 'multiclass',
    'num_class': len(np.unique(y_train)),  # number of classes, not the class labels
    'min_samples_split': 200,  # should be between 0.5-1% of samples
    'min_samples_leaf': 50,
    'subsample': 0.8,
}

# Search settings: fixed search options plus the hyperparameter distributions
search_params = {
    'fixed': {
        'cv': 3,
        'n_iter': 80,
        'verbose': True,
        'random_state': 0,
    },
    'variable': {
        'learning_rate': [0.1, 0.01, 0.005],
        'num_leaves': np.linspace(10, 1010, 100, dtype=int),
        'max_depth': np.linspace(2, 22, 10, dtype=int),
    },
}

# fit() keyword arguments -- a mix of scalars and validation data
fit_params = {
    'verbose': True,
    'eval_set': [(X_valid, y_valid)],
    'eval_metric': lgbm_custom_loss,
    'early_stopping_rounds': 5,
}

# Set up the model
lgb_mod = lgb.LGBMClassifier(**mod_fixed_params)

# Add the search grid
np.random.seed(0)
gbm = RandomizedSearchCV(lgb_mod, search_params['variable'], **search_params['fixed'])

# Fit the model
gbm.fit(X_train, y_train, **fit_params)
print('Best parameters found by grid search are: {}'.format(gbm.best_params_))
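For illustration, a quick way to see which of the fit_params above the search could slice per fold: only entries with a length equal to the number of samples qualify. The helper name `looks_sample_aligned` and the stand-in arrays for `eval_set` are hypothetical, not part of the report.

```python
import numpy as np

def looks_sample_aligned(value, n_samples):
    """Hypothetical check mirroring the 'slicability' requirement: the value
    must have a length, and that length must match the number of samples."""
    return hasattr(value, "__len__") and len(value) == n_samples

n_samples = 100  # stand-in for len(X_train)
fit_params = {
    'verbose': True,                                  # scalar
    'eval_set': [(np.zeros((20, 3)), np.zeros(20))],  # length 1, not n_samples
    'early_stopping_rounds': 5,                       # scalar
    'sample_weight': np.ones(n_samples),              # one entry per sample
}
for name, value in fit_params.items():
    print(name, looks_sample_aligned(value, n_samples))
# Only sample_weight is sliceable per fold; the rest must be passed through.
```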
I’ve traced the error through, and it starts at model_selection/_search.py line 652.
Expected Results
Expected LightGBM to run through RandomizedSearchCV.
Actual Results
TypeError: Singleton array array(True) cannot be considered a valid collection.
Versions
Issue Analytics
- Created: 4 years ago
- Comments: 17 (13 by maintainers)
It’s been raised 2-3 times in the couple of weeks since 0.22 was released, and not before.
In 0.21.X:
https://github.com/scikit-learn/scikit-learn/blob/ee328faa3601b40944ad43e28bce71860d39f2de/sklearn/model_selection/_search.py#L630-L632
in 0.22.X
https://github.com/scikit-learn/scikit-learn/blob/bf24c7e3d6d768dddbfad3c26bb3f23bc82c0a18/sklearn/model_selection/_search.py#L650-L654
It does. But we have tacitly supported this behaviour for many, many releases and have changed it without warning. The support is more than tacit, in the sense that _fit_and_score explicitly makes use of a helper that bypasses fit params that are not samplewise: https://github.com/scikit-learn/scikit-learn/blob/bf24c7e3d6d768dddbfad3c26bb3f23bc82c0a18/sklearn/model_selection/_validation.py#L940-L944
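A hedged sketch of what that helper does (a simplified reimplementation for illustration, not the actual scikit-learn source): values that are not array-like with one entry per sample are returned unchanged, while sample-aligned values are indexed down to the current fold.

```python
import numpy as np

def index_param_value(X, v, indices):
    """Simplified stand-in for the 0.21-era helper: slice v only when it is
    array-like with one entry per sample of X; otherwise pass it through."""
    if not hasattr(v, "__len__") or len(v) != len(X):
        return v                        # e.g. early_stopping_rounds=5, verbose=True
    return np.asarray(v)[indices]       # e.g. sample_weight

X = np.zeros((6, 2))
train_idx = np.array([0, 2, 4])
print(index_param_value(X, 5, train_idx))             # passed through unchanged: 5
print(index_param_value(X, np.arange(6), train_idx))  # sliced to the fold: [0 2 4]
```

Under this scheme a scalar fit param never reaches the indexing path, which is why the 0.21 behaviour accepted it; the 0.22 change validates every fit param up front instead.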
Thus the previous behaviour could be understood as supported and intended behaviour, even though it was untested (with respect to search at least).
Yes, we can change behaviour around things that do not conform to our conventions, but the change was introduced by @amueller in #14702 and was incidental to that PR. If we are going to change our handling of popular if non-conforming estimators, it should be done intentionally, and incidental changes should indeed be reverted in patch releases, IMO.
Let’s not deprecate non-aligned fit_params just yet; we need to think about it carefully first. Non-aligned fit_params are one proposed way to implement the new warm-start API: https://github.com/scikit-learn/scikit-learn/pull/15105
We might also want to add feature-aligned params in the future, who knows.