Pipelines: standard error when calling transform/predict before fit

In #969 (issue #851) we added a universal error for calling predict/transform before fitting a component, using a metaclass that wraps predict/transform to check whether the component has been fitted. We should add the same treatment for pipelines so that anything that inherits from PipelineBase receives the same validation.
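For readers unfamiliar with the pattern, here is a minimal sketch of what a metaclass-based check could look like for pipelines. It is not the code from #969: the names `PipelineBaseMeta`, `NotFittedError`, `_is_fitted` and `CustomPipeline` are assumptions made up for illustration, and the `PipelineBase` below is a toy stand-in rather than the library's class.

```python
import functools


class NotFittedError(Exception):
    """Hypothetical error raised when a pipeline is used before fit."""


class PipelineBaseMeta(type):
    """Hypothetical metaclass that guards predict/transform on every subclass."""

    GUARDED = ("predict", "transform")

    def __new__(mcs, name, bases, namespace):
        # Wrap fit so that a successful call marks the instance as fitted.
        if "fit" in namespace:
            namespace["fit"] = mcs._mark_fitted(namespace["fit"])
        # Wrap guarded methods so they fail fast on an unfitted instance.
        for method_name in mcs.GUARDED:
            if method_name in namespace:
                namespace[method_name] = mcs._check_fitted(namespace[method_name])
        return super().__new__(mcs, name, bases, namespace)

    @staticmethod
    def _mark_fitted(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            result = method(self, *args, **kwargs)
            self._is_fitted = True
            return result
        return wrapper

    @staticmethod
    def _check_fitted(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if not getattr(self, "_is_fitted", False):
                raise NotFittedError(
                    f"{type(self).__name__} must be fitted before calling "
                    f"{method.__name__}."
                )
            return method(self, *args, **kwargs)
        return wrapper


class PipelineBase(metaclass=PipelineBaseMeta):
    """Toy stand-in for the real base class."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return [0 for _ in X]


class CustomPipeline(PipelineBase):
    # Overriding predict does not bypass the check: the metaclass wraps it too.
    def predict(self, X):
        return [1 for _ in X]
```

Because the wrapping happens when each class is created, a subclass that overrides `predict` or `transform` still gets the check without having to call any helper itself.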

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
dsherry commented, Aug 13, 2020

@angela97lin tl;dr: ok, sounds good!

We went with the metaclass strategy for checking this stuff in the components because our components are intended to be extended with custom fit, transform and predict methods, and the only sure way to get our validation code into those definitions was to inject them at class definition-time via the metaclass.

We’re now in a place where our pipeline classes are intended as a template to wrap a graph of components, rather than directly holding custom implementations of fit/predict, although that is still supported and should remain supported for the time being. I think that’s great. One implication is that the metaclass strategy is not necessary to satisfy the goal of this issue, which is to raise a clear error message when pipeline predict is called before fit.
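To illustrate the point that a metaclass is not strictly required when the pipeline is only a template around a component graph, here is a rough sketch that does the check inside the base class's own `predict`. All names (`SimplePipeline`, `PipelineNotFittedError`, `AddOne`, `MeanEstimator`) are hypothetical, and the linear component list is a simplifying assumption standing in for a real graph.

```python
class PipelineNotFittedError(Exception):
    """Hypothetical error raised when predict is called before fit."""


class AddOne:
    """Toy transformer used only for this example."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [x + 1 for x in X]


class MeanEstimator:
    """Toy estimator that always predicts the mean of the training targets."""

    def fit(self, X, y=None):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class SimplePipeline:
    """Template pipeline: transformers first, final estimator last."""

    def __init__(self, components):
        self.components = components
        self._is_fitted = False

    def _check_is_fitted(self, method_name):
        if not self._is_fitted:
            raise PipelineNotFittedError(
                f"This pipeline must be fitted before calling {method_name}."
            )

    def fit(self, X, y=None):
        for component in self.components[:-1]:
            X = component.fit(X, y).transform(X)
        self.components[-1].fit(X, y)
        self._is_fitted = True
        return self

    def predict(self, X):
        # The check lives in the template method itself, no metaclass needed.
        self._check_is_fitted("predict")
        for component in self.components[:-1]:
            X = component.transform(X)
        return self.components[-1].predict(X)
```

The trade-off is that a subclass which overrides `predict` directly would skip the check unless it calls `_check_is_fitted` itself, which is the case the metaclass approach covers.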

However, implementing a pipeline metaclass now is an investment which would be nice to make. We have at least a couple issues outstanding tracking class definition-time validation of various fields on both pipelines and components. And a metaclass is almost certainly the right choice to get that done. So, full speed ahead! 🏎️😁
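As a sketch of what class definition-time validation could look like, the metaclass below rejects a subclass that is missing a required field as soon as the class statement runs. The field names `name` and `component_graph`, and the classes `ValidatedMeta`, `Base` and `GoodPipeline`, are assumptions chosen for illustration, not the fields tracked in the outstanding issues.

```python
class ValidatedMeta(type):
    """Hypothetical metaclass that validates class-level fields on subclasses."""

    # Assumed required fields and their expected types.
    REQUIRED_FIELDS = {"name": str, "component_graph": list}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # The root base class declares no fields, so only subclasses are checked.
        if not bases:
            return
        for field, expected_type in ValidatedMeta.REQUIRED_FIELDS.items():
            value = getattr(cls, field, None)
            if not isinstance(value, expected_type):
                raise TypeError(
                    f"{name} must define class attribute '{field}' "
                    f"of type {expected_type.__name__}."
                )


class Base(metaclass=ValidatedMeta):
    """Root class; exempt from validation."""


class GoodPipeline(Base):
    name = "Good Pipeline"
    component_graph = ["Imputer", "Estimator"]
```

Defining a subclass of `Base` without `component_graph` raises `TypeError` at import time, before any instance is created.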

1 reaction
dsherry commented, Aug 5, 2020

Yep. My thought is that we should generalize the metaclass pattern @jeremyliweishih added for components, and then use it on both components and pipelines.
