Enhancement: base class for sklearn models

Describe the workflow you want to enable

Sklearn does not provide abstract classes for the fit and transform methods, which makes code hints less helpful. Type hints are very important in Python projects, so it would help if sklearn had abstract classes that declare fit and transform.

I am hoping for abstract classes like these:

class IFit:
    def fit(self, x, y, **params):
        pass

class ITransform:
    def transform(self, x, y, **params):
        pass

Describe your proposed solution

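import abc

from sklearn.base import BaseEstimator, ClassifierMixin, ClusterMixin, RegressorMixin

class IFit(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def fit(self, x, y, **params):
        pass

class ITransform(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def transform(self, x, y, **params):
        pass

# Proposed base classes; concrete estimators would subclass one of these
# and implement the abstract methods.
class BaseClassifier(IFit, ITransform, BaseEstimator, ClassifierMixin):
    pass

class BaseRegressor(IFit, ITransform, BaseEstimator, RegressorMixin):
    pass

class BaseCluster(IFit, ITransform, BaseEstimator, ClusterMixin):
    pass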
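
To illustrate what abstract classes of this kind would enforce, here is a minimal sketch that uses only the standard library. `MeanModel` and `Incomplete` are hypothetical names made up for the example, and the `fit` signature is an assumption rather than anything scikit-learn defines.

```python
import abc


class IFit(abc.ABC):
    """Abstract interface: subclasses must implement fit."""

    @abc.abstractmethod
    def fit(self, x, y=None, **params):
        """Learn from x (and optionally y) and return self."""


class MeanModel(IFit):
    """Hypothetical model that implements the interface."""

    def fit(self, x, y=None, **params):
        self.mean_ = sum(x) / len(x)
        return self


class Incomplete(IFit):
    """Hypothetical subclass that forgets to implement fit."""


# A complete subclass can be instantiated and used as usual.
model = MeanModel().fit([1.0, 2.0, 3.0])

# An incomplete subclass cannot be instantiated at all.
try:
    Incomplete()
    instantiation_failed = False
except TypeError:
    instantiation_failed = True
```

The check only applies to classes that inherit from `IFit`; an object that merely defines a `fit` method is not recognized as an `IFit`.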

Describe alternatives you’ve considered, if relevant

No response

Additional context

No response

Issue Analytics

State:
Created 2 years ago
Reactions: 1
Comments: 6 (4 by maintainers)

Top GitHub Comments

2 reactions

thomasjpfan commented, Oct 14, 2021

typing’s Protocol was designed to allow for duck typing. The PEP that describes Protocol has examples that promote duck typing: https://www.python.org/dev/peps/pep-0544/. One can say a Protocol “explicitly defines an interface”.

The way scikit-learn estimators are duck typed prevents us from having type checking based on a class hierarchy. We do not require third party estimators to inherit from our classes, such as BaseEstimator. As long as a third party estimator defines the required methods, it should work with sklearn’s functions and meta-estimators. In other words, even if we had abstract classes, we could not use them for typing in sklearn, because we do not require third party estimators to inherit from them.
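
As a rough sketch of the Protocol approach described in this comment, the snippet below uses only the standard library. `TransformerLike`, `ThirdPartyScaler`, and `fit_then_transform` are hypothetical names invented for illustration; scikit-learn does not ship a protocol like this, and the method signatures are assumptions.

```python
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class TransformerLike(Protocol):
    """Hypothetical protocol: any object with fit and transform methods."""

    def fit(self, X: Any, y: Any = None) -> Any: ...

    def transform(self, X: Any) -> Any: ...


class ThirdPartyScaler:
    """Duck-typed estimator that inherits from no sklearn class."""

    def fit(self, X, y=None):
        self.max_ = max(X)
        return self

    def transform(self, X):
        return [x / self.max_ for x in X]


def fit_then_transform(est: TransformerLike, X: List[float]) -> List[float]:
    # A static type checker accepts any object that structurally matches
    # TransformerLike; no inheritance is required.
    return est.fit(X).transform(X)


scaled = fit_then_transform(ThirdPartyScaler(), [1.0, 2.0, 4.0])

# runtime_checkable allows isinstance checks, which only test that the
# methods exist, not their signatures.
is_transformer = isinstance(ThirdPartyScaler(), TransformerLike)
```

With a Protocol, the interface is still written down explicitly, but third party estimators satisfy it by defining the methods rather than by subclassing.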

0 reactions

franklucky001 commented, Oct 14, 2021

@thomasjpfan Implicitly implementing an interface is not Pythonic, and fit and transform are not standard Python protocols. I know that adding these features would require major changes to the code; this is just a suggestion.
