Pipeline requires both fit and transform method to be available instead of only fit_transform

Describe the bug

Constructing a pipeline with a nonparametric estimator as an intermediate step raises an error because the estimator has no transform() method, even though the pipeline itself calls fit_transform() when it is present. Nonparametric estimators (the most prominent being t-SNE) have no regular transform() method because no projection or mapping is learned. They could still be used for dimensionality reduction inside a pipeline.
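To make the dispatch described above concrete, here is a minimal sketch in plain Python. It is not scikit-learn's code: `fit_transform_step`, `EmbeddingOnly`, and `Scaler` are hypothetical names introduced for illustration, and the stand-in classes only mimic the method surface of a TSNE-like and a PCA-like estimator.

```python
class EmbeddingOnly:
    """Hypothetical stand-in for a TSNE-like estimator: fit_transform() but no transform()."""

    def __init__(self):
        self.calls = []

    def fit(self, X, y=None):
        self.calls.append("fit")
        return self

    def fit_transform(self, X, y=None):
        self.calls.append("fit_transform")
        # Pretend to reduce every row to a single dimension.
        return [[row[0]] for row in X]


class Scaler:
    """Hypothetical parametric transformer with separate fit() and transform()."""

    def __init__(self):
        self.calls = []

    def fit(self, X, y=None):
        self.calls.append("fit")
        self.factor_ = 2  # "learned" parameter, fixed here for illustration
        return self

    def transform(self, X):
        self.calls.append("transform")
        return [[v * self.factor_ for v in row] for row in X]


def fit_transform_step(step, X, y=None):
    """Prefer fit_transform() when the step has it, else chain fit() and transform()."""
    if hasattr(step, "fit_transform"):
        return step.fit_transform(X, y)
    return step.fit(X, y).transform(X)


X = [[1.0, 2.0], [3.0, 4.0]]
embedding, scaler = EmbeddingOnly(), Scaler()
reduced = fit_transform_step(embedding, X)
scaled = fit_transform_step(scaler, X)
```

In this sketch the fitting path never needs transform() on the TSNE-like object, which is the point the report makes: the missing method only matters to the validation check, not to fitting itself.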

Steps/Code to Reproduce

Example:

from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.pipeline import make_pipeline

# TSNE is an intermediate step here and has no transform(), so this call raises.
make_pipeline(TSNE(), PCA())

Expected Results

A pipeline is created.

Actual Results

Output:

TypeError: All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough' 'TSNE(angle=0.5,...

Possible Solution

Change the check at https://github.com/scikit-learn/scikit-learn/blob/95d4f0841/sklearn/pipeline.py#L179 to the following, so that a step is rejected only when it offers neither fit_transform nor both fit and transform:

if not hasattr(t, "fit_transform") and not (hasattr(t, "fit") and hasattr(t, "transform")):
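The intent of the relaxed check can be sketched as a standalone predicate. This is an illustration under assumptions, not a patch to scikit-learn: `is_usable_intermediate_step` and the three stand-in classes are hypothetical names. A step should be rejected exactly when this predicate is false, that is, when it has neither fit_transform nor the fit/transform pair.

```python
class FitTransformOnly:
    """Hypothetical TSNE-like step: fit() and fit_transform(), no transform()."""

    def fit(self, X, y=None):
        return self

    def fit_transform(self, X, y=None):
        return X


class FitAndTransform:
    """Hypothetical PCA-like step: fit() and transform()."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class FitOnly:
    """Hypothetical predictor-like step: fit() only, so it cannot pass data on."""

    def fit(self, X, y=None):
        return self


def is_usable_intermediate_step(t):
    """Accept a step that has fit_transform(), or both fit() and transform()."""
    has_fit_transform = hasattr(t, "fit_transform")
    has_fit_and_transform = hasattr(t, "fit") and hasattr(t, "transform")
    return has_fit_transform or has_fit_and_transform


results = {
    cls.__name__: is_usable_intermediate_step(cls())
    for cls in (FitTransformOnly, FitAndTransform, FitOnly)
}
```

Note that this only covers fitting. Whether a pipeline containing such a step could later support transform() or predict() on new data is a separate question that the predicate does not address.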

Versions

import sklearn; sklearn.show_versions()
System:
    python: 3.8.2 (default, Feb 26 2020, 22:21:03)  [GCC 9.2.1 20200130]
executable: /usr/bin/python3
   machine: Linux-5.5.9-arch1-2-x86_64-with-glibc2.2.5

Python dependencies:
       pip: 20.0.2
setuptools: 46.0.0
   sklearn: 0.22.2.post1
     numpy: 1.18.1
     scipy: 1.4.1
    Cython: 0.29.15
    pandas: 1.0.1
matplotlib: 3.2.0
    joblib: 0.14.1

Built with OpenMP: True

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 11 (9 by maintainers)

Top GitHub Comments

jnothman commented, Mar 17, 2020 (1 reaction)

if_delegate_has_method does not check the intermediate estimators, only the final one.

FeatureUnion._validate_transformers handles all the transformers identically. Pipeline._validate_steps has to handle the last step differently.
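The asymmetry described in this comment can be sketched as follows. This is a simplified illustration of the behavior the error message implies, not the actual body of Pipeline._validate_steps: the function `validate_steps`, its error messages, and the stand-in classes are hypothetical.

```python
class WithTransform:
    """Hypothetical PCA-like step: fit() and transform()."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class WithoutTransform:
    """Hypothetical TSNE-like step: fit() and fit_transform(), no transform()."""

    def fit(self, X, y=None):
        return self

    def fit_transform(self, X, y=None):
        return X


def validate_steps(steps):
    """Check intermediate steps strictly; the last step only needs fit()."""
    *intermediate, last = steps
    for t in intermediate:
        if t == "passthrough":
            continue
        # Strict rule for intermediate steps: transform() is mandatory.
        if not (hasattr(t, "fit") or hasattr(t, "fit_transform")) or not hasattr(t, "transform"):
            raise TypeError(f"intermediate step {t!r} must implement fit and transform")
    # Looser rule for the final step.
    if last != "passthrough" and not hasattr(last, "fit"):
        raise TypeError(f"last step {last!r} must implement fit")


def is_valid(steps):
    """Return True if validate_steps accepts the step list, False otherwise."""
    try:
        validate_steps(steps)
    except TypeError:
        return False
    return True


last_position_ok = is_valid([WithTransform(), WithoutTransform()])
first_position_ok = is_valid([WithoutTransform(), WithTransform()])
```

Under these assumptions, the same TSNE-like object passes in the last position and fails in any earlier one, which is why a check that treats every transformer identically (as a feature union can) does not carry over directly to a pipeline.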

raisa1921 commented, Oct 17, 2020 (0 reactions)

I’m having this error. Can you please tell me how to solve it?


