check_estimator is stricter than what is stated in the Estimator API doc
This issue was raised in a discussion regarding the LightGBM scikit-learn compatible estimators: https://github.com/microsoft/LightGBM/issues/2628#issuecomment-574369813
The problem is that check_estimator complains about private attributes set in the __init__ of a scikit-learn estimator, while our documentation only states the following (without explicitly prohibiting setting private attributes in __init__):
The arguments accepted by __init__ should all be keyword arguments with a default value. In other words, a user should be able to instantiate an estimator without passing any arguments to it. The arguments should all correspond to hyperparameters describing the model or the optimisation problem the estimator tries to solve. These initial arguments (or parameters) are always remembered by the estimator. Also note that they should not be documented under the “Attributes” section, but rather under the “Parameters” section for that estimator.
In addition, every keyword argument accepted by __init__ should correspond to an attribute on the instance. Scikit-learn relies on this to find the relevant attributes to set on an estimator when doing model selection.
from: https://scikit-learn.org/0.22/developers/develop.html#instantiation
The strict check is defined in sklearn.utils.estimator_checks.check_no_attributes_set_in_init.
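To make the mismatch concrete, here is a minimal sketch of the kind of comparison such a check performs. It is not scikit-learn's actual implementation: the helper attributes_not_in_init_signature and both estimator classes are hypothetical names introduced only for illustration, and the sketch ignores parent-class parameters.

```python
import inspect


def attributes_not_in_init_signature(estimator):
    """Return instance attributes that are not __init__ parameters.

    Hypothetical, simplified stand-in for the idea behind
    check_no_attributes_set_in_init; not scikit-learn's own code.
    """
    init_params = set(inspect.signature(type(estimator).__init__).parameters)
    init_params.discard("self")
    # Anything stored on the instance right after construction that is
    # not a constructor parameter gets reported.
    return sorted(set(vars(estimator)) - init_params)


class StrictEstimator:
    """Only stores its constructor parameters."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha


class EstimatorWithPrivateState:
    """Also sets a private attribute in __init__."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self._cache = None  # private attribute, not a hyperparameter


print(attributes_not_in_init_signature(StrictEstimator()))            # []
print(attributes_not_in_init_signature(EstimatorWithPrivateState()))  # ['_cache']
```

Under this kind of rule the second class is flagged even though it follows everything the quoted documentation asks for, which is the discrepancy this issue is about.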
Issue Analytics
- Created 4 years ago
- Comments: 29 (21 by maintainers)
I think that setting private attributes in __init__ is fine as long as the public attributes defined as constructor params (that is, the model hyperparameters) can be read or set at any moment, either via getattr/setattr or via get_params/set_params, and that subsequent calls to fit/predict/transform automatically take those changes into account.
Hello guys!
Is there any progress on this? Will it be possible to resolve this issue by next release?
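For reference, the condition proposed in the earlier comment (private attributes in __init__ are acceptable as long as hyperparameter changes are always honoured) could look roughly like the sketch below. ScaledMean, its scale parameter, and the hand-written get_params/set_params are hypothetical and only mimic the scikit-learn convention; a real estimator would normally inherit these methods from sklearn.base.BaseEstimator.

```python
class ScaledMean:
    """Hypothetical estimator that predicts scale * mean(y)."""

    def __init__(self, scale=1.0):
        self.scale = scale       # public hyperparameter, stored unchanged
        self._n_fit_calls = 0    # private bookkeeping set in __init__

    def get_params(self):
        return {"scale": self.scale}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self

    def fit(self, y):
        # The hyperparameter is read at fit time, so changes made through
        # setattr or set_params after construction are taken into account.
        self._n_fit_calls += 1
        self.mean_ = self.scale * sum(y) / len(y)
        return self

    def predict(self):
        return self.mean_


est = ScaledMean().fit([1.0, 2.0, 3.0])
print(est.predict())                                  # 2.0

est.set_params(scale=2.0).fit([1.0, 2.0, 3.0])
print(est.predict())                                  # 4.0

est.scale = 0.5                                       # plain setattr also works
print(est.fit([1.0, 2.0, 3.0]).predict())             # 1.0
```

The private counter never influences how the hyperparameters are read or written, so the estimator behaves the same whether scale is passed to the constructor, set through set_params, or assigned directly.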