Pipeline requires both fit and transform methods to be available instead of only fit_transform
Describe the bug
Building a pipeline with a nonparametric estimator as an intermediate step raises an error because the estimator
has no transform() method. When fitting, the pipeline itself calls fit_transform()
if it is present. Nonparametric estimators (the most prominent being t-SNE) have no regular transform()
method, since no projection or mapping is learned that could be applied to new data. Such an estimator could still be used for dimensionality reduction inside a pipeline.
Steps/Code to Reproduce
Example:
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.pipeline import make_pipeline
make_pipeline(TSNE(), PCA())
Expected Results
A pipeline is created.
Actual Results
Output:
TypeError: All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough' 'TSNE(angle=0.5,...
Possible Solution
Edit the check at https://github.com/scikit-learn/scikit-learn/blob/95d4f0841/sklearn/pipeline.py#L179 to the following, so that the error is raised only when a step has neither fit_transform nor both fit and transform:
if not hasattr(t, "fit_transform") and not (hasattr(t, "fit") and hasattr(t, "transform")):
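The intent of this check can be sketched without scikit-learn: a step is accepted if it offers fit_transform, or both fit and transform. The helper name is_valid_intermediate_step and the three stub classes below are hypothetical and exist only for illustration; they are not part of scikit-learn.

```python
def is_valid_intermediate_step(t):
    """Return True if `t` could serve as an intermediate pipeline step."""
    if t == "passthrough":
        return True
    has_fit_transform = hasattr(t, "fit_transform")
    has_fit_and_transform = hasattr(t, "fit") and hasattr(t, "transform")
    # Either capability is enough to push data through the step while fitting.
    return has_fit_transform or has_fit_and_transform


class FitTransformOnly:
    """Hypothetical stand-in for a nonparametric estimator such as TSNE."""

    def fit(self, X, y=None):
        return self

    def fit_transform(self, X, y=None):
        return X


class FitAndTransform:
    """Hypothetical stand-in for a regular transformer such as PCA."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class FitOnly:
    """Hypothetical estimator that cannot transform data at all."""

    def fit(self, X, y=None):
        return self


print(is_valid_intermediate_step(FitTransformOnly()))  # True
print(is_valid_intermediate_step(FitAndTransform()))   # True
print(is_valid_intermediate_step(FitOnly()))           # False
```

Note that a step accepted only because of fit_transform still cannot be applied to new data, so a pipeline containing it would presumably support fitting but not a later transform or predict on unseen samples.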
Versions
import sklearn; sklearn.show_versions()
System:
python: 3.8.2 (default, Feb 26 2020, 22:21:03) [GCC 9.2.1 20200130]
executable: /usr/bin/python3
machine: Linux-5.5.9-arch1-2-x86_64-with-glibc2.2.5
Python dependencies:
pip: 20.0.2
setuptools: 46.0.0
sklearn: 0.22.2.post1
numpy: 1.18.1
scipy: 1.4.1
Cython: 0.29.15
pandas: 1.0.1
matplotlib: 3.2.0
joblib: 0.14.1
Built with OpenMP: True
Issue Analytics
- State:
- Created 4 years ago
- Comments:11 (9 by maintainers)
Top GitHub Comments
if_delegate_has_method does not check the intermediate estimators, only the final one.
FeatureUnion._validate_transformers handles all the transformers identically. Pipeline._validate_steps has to handle the last step differently.
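To illustrate why the last step needs separate treatment, here is a rough sketch of a validator that checks intermediate steps and the final step under different rules. The function validate_steps and the stub classes are hypothetical and simplified; this is not the actual Pipeline._validate_steps code. It assumes the relaxed rule proposed in the issue for intermediate steps and requires only fit from the final step.

```python
def validate_steps(steps):
    """Check a list of (name, estimator) pairs; raise TypeError on a bad step."""
    intermediate, (last_name, last) = steps[:-1], steps[-1]

    for name, t in intermediate:
        if t == "passthrough":
            continue
        # Relaxed rule from the issue: fit_transform alone is sufficient.
        usable = hasattr(t, "fit_transform") or (
            hasattr(t, "fit") and hasattr(t, "transform")
        )
        if not usable:
            raise TypeError(f"Intermediate step '{name}' cannot transform data")

    # The last step is handled differently: it only has to be fittable.
    if last != "passthrough" and not hasattr(last, "fit"):
        raise TypeError(f"Last step '{last_name}' does not implement fit")


class Embedder:
    """Hypothetical estimator with fit_transform but no transform."""

    def fit(self, X, y=None):
        return self

    def fit_transform(self, X, y=None):
        return X


class Classifier:
    """Hypothetical final estimator with fit and predict only."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return [0 for _ in X]


# Accepted: the embedder is intermediate, the classifier is last.
validate_steps([("embed", Embedder()), ("clf", Classifier())])

# Rejected: a classifier cannot act as an intermediate step.
try:
    validate_steps([("clf", Classifier()), ("embed", Embedder())])
except TypeError as exc:
    print(exc)
```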
I’m having this error. Can you please tell me how to solve it?