Init parameter check of estimator_checks overly restrictive
Describe the bug
I’m working on a custom estimator that needs an optimizer as one of its hyperparameters. I’m using the pattern of passing in an uninitialized optimizer and its parameters separately, i.e.
my_estimator = MyEstimator(optimizer=SGD, optimizer__lr=1e-5)
which will then initialize the optimizer as needed. I’m attempting to verify this with the check_parameters_default_constructible test, but it fails due to this assertion:
assert init_param.default in [np.float64, np.int64]
which restricts parameter defaults that are types to np.float64 or np.int64. I don’t understand why this restriction is necessary. Digging through the history, I found that the general type checks were added in 90d5ef17f88916fb8f2ab6e27d6b681bf3b011ec to verify that all parameters are of immutable types. The restriction on type-valued parameters was then added in ab2f539a32b8099a941cefc598c9625e830ecfe4, as a drive-by change without explanation. Since types are immutable, I don’t understand why this was necessary.
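To make the behaviour concrete, here is a rough, stdlib-only sketch of the kind of whitelist check described above. It is not the sklearn implementation: the helper names type_valued_defaults and rejected_type_defaults are made up for illustration, SGD is a dummy stand-in class, and float and int stand in for np.float64 and np.int64 so the snippet runs without numpy.

```python
import inspect


class SGD:
    """Hypothetical stand-in for an optimizer class such as keras.optimizers.SGD."""


class MyEstimator:
    def __init__(self, optimizer=SGD, **kwargs):
        self.optimizer = optimizer


def type_valued_defaults(cls):
    """Map each __init__ parameter whose default is a class to that default."""
    params = inspect.signature(cls.__init__).parameters
    return {
        name: p.default
        for name, p in params.items()
        # Parameter.empty is itself a class, so exclude it explicitly.
        if p.default is not inspect.Parameter.empty and isinstance(p.default, type)
    }


def rejected_type_defaults(cls, allowed_types):
    """Return the parameter names whose class-valued default is not whitelisted."""
    return [
        name
        for name, default in type_valued_defaults(cls).items()
        if default not in allowed_types
    ]


# float and int are stand-ins for np.float64 and np.int64 (assumption, see above).
print(rejected_type_defaults(MyEstimator, allowed_types=[float, int]))
```

With a whitelist like this, any class used as a default is rejected unless it is explicitly listed, which is the behaviour I am running into with optimizer=SGD.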
I’m not sure if this is a bug or I am missing something. CC @amueller who added that check originally.
Steps/Code to Reproduce
Define an estimator such as
from keras.optimizers import SGD
from sklearn.base import BaseEstimator
from sklearn.utils.estimator_checks import check_parameters_default_constructible

class MyEstimator(BaseEstimator):
    def __init__(self, optimizer=SGD, **kwargs):
        self.optimizer = optimizer

check_parameters_default_constructible("default_constructible", MyEstimator)
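For context, this is roughly how I intend the estimator to build the optimizer from the prefixed keyword arguments. It is only a sketch under assumptions: SGD here is a dummy class rather than the keras one, the _make_optimizer helper and the stored kwargs attribute are hypothetical, and BaseEstimator is left out so the snippet runs without sklearn.

```python
class SGD:
    """Hypothetical stand-in for keras.optimizers.SGD; it only records lr."""

    def __init__(self, lr=0.01):
        self.lr = lr


class MyEstimator:
    def __init__(self, optimizer=SGD, **kwargs):
        # Store the optimizer class itself, not an instance.
        self.optimizer = optimizer
        self.kwargs = kwargs

    def _make_optimizer(self):
        """Instantiate the optimizer from the 'optimizer__'-prefixed kwargs."""
        prefix = "optimizer__"
        params = {
            key[len(prefix):]: value
            for key, value in self.kwargs.items()
            if key.startswith(prefix)
        }
        return self.optimizer(**params)


est = MyEstimator(optimizer=SGD, optimizer__lr=1e-5)
opt = est._make_optimizer()
print(type(opt).__name__, opt.lr)
```

The default stored on the estimator is always the class, so a fresh optimizer instance is only created when it is needed.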
Expected Results
The test passes since the estimator is default constructible and all arguments are of immutable type (the optimizer itself is of course not immutable, but the type is).
Actual Results
$ python3 example.py
Using TensorFlow backend.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
Traceback (most recent call last):
  File "tmp.py", line 9, in <module>
    check_parameters_default_constructible("default_constructible", MyEstimator)
  File "/nix/store/isa1ilgb10xpkm6hjxaaw9m7g1xiiqp1-python3-3.7.7-env/lib/python3.7/site-packages/sklearn/utils/estimator_checks.py", line 2533, in check_parameters_default_constructible
    assert init_param.default in [np.float64, np.int64]
AssertionError
Versions
System:
python: 3.7.7 (default, Mar 10 2020, 06:34:06) [GCC 9.3.0]
executable: /nix/store/isa1ilgb10xpkm6hjxaaw9m7g1xiiqp1-python3-3.7.7-env/bin/python3.7
machine: Linux-5.4.46-x86_64-with
Python dependencies:
pip: None
setuptools: 45.2.0.post20200508
sklearn: 0.22.2.post1
numpy: 1.18.3
scipy: 1.4.1
Cython: None
pandas: 1.0.3
matplotlib: 3.2.1
joblib: 0.14.1
Built with OpenMP: True
(The relevant code hasn’t changed in current sklearn master, though.)
Issue Analytics
- Created 3 years ago
- Comments:13 (13 by maintainers)
Top GitHub Comments
Okay, let’s move forward with the suggestion by @rth (https://github.com/scikit-learn/scikit-learn/issues/17756#issuecomment-652469631) to be more accepting of defaults.
The PR is now ready for review: https://github.com/scikit-learn/scikit-learn/pull/17936