Add support to Imbalanced-learn pipelines in ClassifierExplainer

When I try to create a `ClassifierExplainer` on an imblearn pipeline, I get the following error:

TypeError: All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough'
'SMOTETomek(random_state=42)' (type <class 'imblearn.combine._smote_tomek.SMOTETomek'>) doesn't

Full traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-75-2a92cf12a19d> in <module>
      1 from explainerdashboard import ClassifierExplainer, InlineExplainer
----> 2 explainer = ClassifierExplainer(best_model, X_test, y_test)

/opt/conda/lib/python3.7/site-packages/explainerdashboard/explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, cats_notencoded, idxs, index_name, target, descriptions, n_jobs, permutation_cv, cv, na_fill, precision, labels, pos_label)
   1999                             cats, cats_notencoded, idxs, index_name, target,
   2000                             descriptions, n_jobs, permutation_cv, cv, na_fill,
-> 2001                             precision)
   2002 
   2003         assert hasattr(model, "predict_proba"), \

/opt/conda/lib/python3.7/site-packages/explainerdashboard/explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, cats_notencoded, idxs, index_name, target, descriptions, n_jobs, permutation_cv, cv, na_fill, precision)
    138             if shap != 'kernel':
    139                 pipeline_model = model.steps[-1][1]
--> 140                 pipeline_transformer = Pipeline(model.steps[:-1])
    141                 if hasattr(model, "predict") and hasattr(pipeline_transformer, "transform"):
    142                     X_transformed = pipeline_transformer.transform(X)

/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in __init__(self, steps, memory, verbose)
    116         self.memory = memory
    117         self.verbose = verbose
--> 118         self._validate_steps()
    119 
    120     def get_params(self, deep=True):

/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py in _validate_steps(self)
    169                                 "transformers and implement fit and transform "
    170                                 "or be the string 'passthrough' "
--> 171                                 "'%s' (type %s) doesn't" % (t, type(t)))
    172 
    173         # We allow last estimator to be None as an identity transformation

TypeError: All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough' 'SMOTETomek(random_state=42)' (type <class 'imblearn.combine._smote_tomek.SMOTETomek'>) doesn't
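
The check that fails here can be illustrated without either library. The sketch below is a simplified stand-in for the validation shown in the traceback, not scikit-learn's actual code. All class and function names in it (`StandInSampler`, `StandInScaler`, `StandInClassifier`, `non_transformer_steps`) are hypothetical, and the step order is an assumption chosen to match the error. It relies on the fact that imblearn samplers such as SMOTETomek expose `fit_resample` but no `transform`.

```python
class StandInSampler:
    """Hypothetical stand-in for a sampler: fit and fit_resample, but no transform."""
    def fit(self, X, y):
        return self

    def fit_resample(self, X, y):
        return X, y


class StandInScaler:
    """Hypothetical stand-in for a regular transformer."""
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class StandInClassifier:
    """Hypothetical stand-in for the final model."""
    def fit(self, X, y):
        return self

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def non_transformer_steps(steps):
    """Return the names of intermediate steps that lack fit or transform.

    Simplified version of the rule in the error message: every step except
    the last must be a transformer or the string 'passthrough'.
    """
    offending = []
    for name, step in steps[:-1]:
        if isinstance(step, str) and step == "passthrough":
            continue
        has_fit = hasattr(step, "fit") or hasattr(step, "fit_transform")
        if not (has_fit and hasattr(step, "transform")):
            offending.append(name)
    return offending


# Assumed step order, for illustration only.
steps = [
    ("sampler", StandInSampler()),
    ("scaler", StandInScaler()),
    ("clf", StandInClassifier()),
]

# The explainer drops the model (steps[:-1]) and rebuilds a pipeline from the
# rest, so the sampler ends up as an intermediate step and is rejected.
print(non_transformer_steps(steps[:-1]))
```

Running a check like this on a pipeline's `steps` before building the explainer shows which steps would have to be skipped.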

Versions

System:
    python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) [GCC 9.3.0]
    executable: /opt/conda/bin/python
    machine: Linux-5.4.120+-x86_64-with-debian-buster-sid

Python dependencies:
    pip: 21.1.2
    setuptools: 49.6.0.post20210108
    sklearn: 0.24.2
    imblearn: 0.8.0
    explainerdashboard: latest
    numpy: 1.19.5
    scipy: 1.6.3
    Cython: 0.29.23
    pandas: 1.2.4
    matplotlib: 3.4.2
    joblib: 1.0.1
    threadpoolctl: 2.1.0

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
oegedijk commented, Dec 26, 2021

I’m not very familiar with these types of pipelines, but it seems you could do something like this to get the transformed input dataframe and then extract the model:

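import pandas as pd
from sklearn.pipeline import make_pipeline

# Keep only the intermediate steps that implement transform; samplers such as
# SMOTETomek only implement fit_resample and would trigger the same TypeError.
transformers = [step for _, step in pipeline.steps[:-1] if hasattr(step, "transform")]
X_transformed = pd.DataFrame(
    make_pipeline(*transformers).transform(X_test),
    columns=X_test.columns,  # assumes the transformers keep the same columns
)
model = pipeline.steps[-1][1]  # the final estimator of the fitted pipeline

explainer = ClassifierExplainer(model, X_transformed, y_test)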

Actually, maybe I could add some support for this…
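
Skipping the sampler should be safe for the explainer because resampling only matters while fitting: an imblearn pipeline applies samplers during `fit`, not when transforming or predicting. The library-free sketch below illustrates why that matters for `X_test` and `y_test`. Every class and function name in it (`DuplicatingSampler`, `HalveScaler`, `StandInClassifier`, `split_for_explainer`) is hypothetical, and the step order is an assumption made for illustration.

```python
class DuplicatingSampler:
    """Hypothetical sampler: changes the number of rows, so it has no transform()."""
    def fit_resample(self, X, y):
        return X + X, y + y


class HalveScaler:
    """Hypothetical transformer: halves every value and keeps the row count."""
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [[value / 2 for value in row] for row in X]


class StandInClassifier:
    """Hypothetical final model exposing predict_proba."""
    def fit(self, X, y):
        return self

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


def split_for_explainer(steps, X):
    """Apply only the transform-capable steps to X and return (X_transformed, model)."""
    *intermediate, (_, model) = steps
    for _, step in intermediate:
        if hasattr(step, "transform"):
            X = step.transform(X)
    return X, model


# Assumed step order, for illustration only.
steps = [
    ("sampler", DuplicatingSampler()),
    ("scaler", HalveScaler()),
    ("clf", StandInClassifier()),
]
X_test = [[2.0, 4.0], [6.0, 8.0]]
y_test = [0, 1]

X_transformed, model = split_for_explainer(steps, X_test)

# The sampler was skipped, so the rows still line up with y_test.
print(len(X_transformed) == len(y_test))
```

Had the sampler been applied to `X_test`, the transformed data would no longer match `y_test` row for row, which is one reason to pass the explainer only the transformed test set and the final model.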

0 reactions
Abdelgha-4 commented, Dec 25, 2021

Ah, I see. This is regrettable, as I would love to use your work for models trained on imbalanced data. Is there any workaround for this in the meantime?
