Add support for Imbalanced-learn pipelines in ClassifierExplainer
When I try to generate a `ClassifierExplainer` on an imblearn pipeline, I get the following error:
TypeError: All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough'
'SMOTETomek(random_state=42)' (type <class 'imblearn.combine._smote_tomek.SMOTETomek'>) doesn't
Full traceback:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-75-2a92cf12a19d> in <module>
1 from explainerdashboard import ClassifierExplainer, InlineExplainer
----> 2 explainer = ClassifierExplainer(best_model, X_test, y_test)
/opt/conda/lib/python3.7/site-packages/explainerdashboard/explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, cats_notencoded, idxs, index_name, target, descriptions, n_jobs, permutation_cv, cv, na_fill, precision, labels, pos_label)
1999 cats, cats_notencoded, idxs, index_name, target,
2000 descriptions, n_jobs, permutation_cv, cv, na_fill,
-> 2001 precision)
2002
2003 assert hasattr(model, "predict_proba"), \
/opt/conda/lib/python3.7/site-packages/explainerdashboard/explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, cats_notencoded, idxs, index_name, target, descriptions, n_jobs, permutation_cv, cv, na_fill, precision)
138 if shap != 'kernel':
139 pipeline_model = model.steps[-1][1]
--> 140 pipeline_transformer = Pipeline(model.steps[:-1])
141 if hasattr(model, "predict") and hasattr(pipeline_transformer, "transform"):
142 X_transformed = pipeline_transformer.transform(X)
/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in __init__(self, steps, memory, verbose)
116 self.memory = memory
117 self.verbose = verbose
--> 118 self._validate_steps()
119
120 def get_params(self, deep=True):
/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in _validate_steps(self)
169 "transformers and implement fit and transform "
170 "or be the string 'passthrough' "
--> 171 "'%s' (type %s) doesn't" % (t, type(t)))
172
173 # We allow last estimator to be None as an identity transformation
TypeError: All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough' 'SMOTETomek(random_state=42)' (type <class 'imblearn.combine._smote_tomek.SMOTETomek'>) doesn't
Versions
System:
python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) [GCC 9.3.0]
executable: /opt/conda/bin/python
machine: Linux-5.4.120+-x86_64-with-debian-buster-sid
Python dependencies:
pip: 21.1.2
setuptools: 49.6.0.post20210108
sklearn: 0.24.2
imblearn: 0.8.0
explainerdashboard: latest
numpy: 1.19.5
scipy: 1.6.3
Cython: 0.29.23
pandas: 1.2.4
matplotlib: 3.4.2
joblib: 1.0.1
threadpoolctl: 2.1.0
Issue Analytics
- State:
- Created 2 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
I’m not very familiar with these types of pipelines, but it seems you could do something like this to get the transformed input dataframe, and then extract the model:
Actually, maybe I could add some support for this…
Ah, I see. That is regrettable, as I would love to use your work for imbalanced-data models. Is there any workaround for this in the meantime?
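One possible workaround, sketched here under assumptions and not taken from the thread: the traceback shows the explainer rebuilding an sklearn `Pipeline` from `model.steps[:-1]`, which fails because a sampler such as `SMOTETomek` has no `transform` method. Since resampling only matters at fit time, it should be possible to apply just the steps that do implement `transform` and hand the final estimator to the explainer directly. The helper name `split_pipeline` and the stand-in classes below are hypothetical and exist only so the sketch runs without sklearn or imblearn installed.

```python
def split_pipeline(pipeline, X):
    """Return (X_transformed, final_estimator) for an already fitted pipeline.

    Assumes `pipeline.steps` is a list of (name, step) tuples, as in the
    traceback above. Steps without a `transform` method (for example
    samplers that only offer `fit_resample`) are skipped.
    """
    *intermediate, (_, final_estimator) = pipeline.steps
    X_transformed = X
    for _, step in intermediate:
        if step is None or step == "passthrough":
            continue  # identity step, nothing to apply
        if hasattr(step, "transform"):
            X_transformed = step.transform(X_transformed)
        # otherwise: sampler-like step, not applied at prediction time
    return X_transformed, final_estimator


# Hypothetical stand-ins so the sketch is runnable on its own.
class Doubler:
    def transform(self, X):
        return [[2 * value for value in row] for row in X]


class FakeSampler:
    def fit_resample(self, X, y):
        return X, y


class FakeModel:
    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class FakePipeline:
    def __init__(self, steps):
        self.steps = steps


pipe = FakePipeline([
    ("scale", Doubler()),
    ("resample", FakeSampler()),
    ("clf", FakeModel()),
])
X_transformed, model = split_pipeline(pipe, [[1, 2], [3, 4]])
```

With a real fitted imblearn pipeline, the same helper would be called as `split_pipeline(best_model, X_test)`, and the results passed on as `ClassifierExplainer(model, X_transformed, y_test)`. Transformers often return NumPy arrays, so the output may first need wrapping in a pandas DataFrame with suitable column names; whether the explainer requires this is an assumption to verify.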