Make all non-canonical modules private?
This is maybe for 1.0. Recently we started marking files like model_selection._split private, so that everything has a single canonical import:
from sklearn.model_selection import cross_val_score
For many (older?) models that’s not the case. Both of these work:
from sklearn.linear_model.logistic import LogisticRegression
from sklearn.linear_model import LogisticRegression
etc. I think it would be nice to make every file that is not the canonical import path (according to the API documentation) private, with a deprecation period obviously. That might be a bit annoying for existing users who rely on the long import paths, but it makes auto-complete much more helpful and the module structure much less confusing for newcomers.
For example, sklearn.linear_model.ridge is a module, while sklearn.linear_model.ridge_regression is a function that implements ridge regression and sklearn.linear_model.Ridge is a class that implements ridge regression. From the names this is totally unclear.
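One way the "private with deprecation" idea could work is sketched below: the implementation moves to an underscore-prefixed module, and the old long path stays importable but emits a warning. This is only an illustration, not scikit-learn's actual mechanism. The package name mypkg, its modules, the stand-in class, and the choice of FutureWarning are all assumptions made for the example, and the modules are built in memory so the snippet runs on its own.

```python
import sys
import types
import warnings

# Hypothetical private module that holds the real implementation.
private = types.ModuleType("mypkg._logistic")


class LogisticRegression:
    """Stand-in class for the example; not the scikit-learn estimator."""


private.LogisticRegression = LogisticRegression

# Hypothetical deprecated module: the old, long import path.
shim = types.ModuleType("mypkg.logistic")


def _deprecated_getattr(name):
    # Module-level __getattr__ (PEP 562) runs only when normal lookup fails,
    # so every public name requested from the old path lands here.
    if hasattr(private, name):
        warnings.warn(
            f"mypkg.logistic is deprecated; import {name} from mypkg instead.",
            FutureWarning,
            stacklevel=2,
        )
        return getattr(private, name)
    raise AttributeError(f"module 'mypkg.logistic' has no attribute {name!r}")


shim.__getattr__ = _deprecated_getattr

# Canonical package that re-exports the public name.
pkg = types.ModuleType("mypkg")
pkg.__path__ = []  # an empty __path__ marks the module as a package
pkg.LogisticRegression = private.LogisticRegression
pkg._logistic = private
pkg.logistic = shim

# Register the in-memory modules so ordinary import statements find them.
sys.modules.update(
    {"mypkg": pkg, "mypkg._logistic": private, "mypkg.logistic": shim}
)

# The old path still works but warns; the canonical path is silent.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from mypkg.logistic import LogisticRegression as OldPath
from mypkg import LogisticRegression as CanonicalPath
```

Both names resolve to the same class, so existing code keeps running during the deprecation period while auto-complete on the package only needs to surface the canonical name.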
Issue Analytics
- State:
- Created 6 years ago
- Comments: 30 (30 by maintainers)
Top GitHub Comments
You can import
from sklearn.datasets._base import ...
However, you are now aware that _base will be private, and you might have to change your code at a new scikit-learn release since we might not provide backward-compatible code. That said, be aware that we are going to define a developer API, which will provide some backward support for these types of utilities used in third-party code.
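For third-party code that has to survive a module being renamed between releases, one defensive option is to try several candidate import paths in order. The helper below is a hypothetical sketch, not part of scikit-learn; import_first and the candidate list are names invented here, and the demonstration uses the standard library so that it runs without any particular scikit-learn version installed.

```python
import importlib


def import_first(candidates, name):
    """Return attribute `name` from the first importable module in `candidates`."""
    for module_path in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # this module layout does not exist in the installed version
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of {list(candidates)}")


# Demonstration: the first path does not exist, so the second one is used.
JSONDecoder = import_first(["json._no_such_module", "json.decoder"], "JSONDecoder")
```

This keeps the fragile paths in one place, but it does not remove the underlying risk: a private helper can still change behavior or disappear entirely.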
I had a ton of private datasets, such as names, text and geographical data. I used to use the following utils to download and manage these:
What’s the recommended approach now?