LIME with Pipeline
See original GitHub issue

Can you show an example of how to use LIME with a Pipeline? I do
explainer = lime.lime_tabular.LimeTabularExplainer(union, feature_names=names,
class_names=y_train, categorical_features=cat, verbose=True, mode='regression')
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
where union is the sparse matrix obtained from the FeatureUnion. The Pipeline is as below:
pipeline = Pipeline([
    # Use FeatureUnion to combine the features
    ('union', FeatureUnion(
        transformer_list=[
            # text
            ('text', Pipeline([
                ('select', norm()),
                ('tfidf', TfidfVectorizer(max_df=40, min_df=3, ngram_range=(1, 4)))
            ])),
            # categorical
            ('categorical', Pipeline([
                ('selector', MultiColumn(columns=['a', 'b'])),
                ('one_hot', Cat())
            ])),
            # numeric
            ('numeric', Pipeline([
                ('date', Age()),
                ('scaling', preprocessing.MinMaxScaler())
            ])),
        ])),
    # Use a regression
    ('model_fitting', xgb),
])
I extract the feature names for the text branch as below:
one = pipeline.named_steps['union'].transformer_list[0][1].named_steps['tfidf'].get_feature_names()
Then the names of the categorical variables:
enc = DictVectorizer(sparse=False)
cat = X[['a', 'b']]
enccat = enc.fit((cat.reset_index(drop=True)).T.to_dict().values())
two = enccat.get_feature_names()
and for numeric just
three = ['age']
Then I get the array of feature names:
names = one + two + three
names = np.asarray(names)
cat = np.asarray(two)
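One pitfall worth noting here: the `categorical_features` argument of `LimeTabularExplainer` takes integer column indices, whereas `cat` above holds name strings. A minimal sketch of mapping the one-hot names to their column positions (the name lists below are stand-ins for the real `one`/`two`/`three`):

```python
# Stand-in name lists mirroring the one / two / three lists above
one = ['word0', 'word1', 'word2']   # tf-idf feature names (text branch)
two = ['a=x', 'a=y', 'b=0']         # one-hot names (categorical branch)
three = ['age']                     # numeric branch

names = one + two + three
# Integer positions of the categorical columns, as LIME expects
cat_indices = [i for i, n in enumerate(names) if n in set(two)]
print(cat_indices)  # [3, 4, 5]
```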
I get the sparse matrix by using only the FeatureUnion part of the Pipeline:
union = union.fit_transform(X_train)
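A plausible cause of the ValueError above is that `LimeTabularExplainer` expects a dense 2-D numpy array, while `fit_transform` on the FeatureUnion returns a scipy sparse matrix. A minimal sketch of densifying it first (the toy matrix is a stand-in for the real union output):

```python
import numpy as np
from scipy import sparse

# Stand-in for the sparse matrix produced by union.fit_transform(X_train)
union = sparse.random(5, 4, density=0.5, format='csr', random_state=0)

# LimeTabularExplainer wants a dense 2-D array, not a scipy sparse matrix
dense = union.toarray()
print(dense.shape)  # (5, 4)
```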
Issue Analytics
- State:
- Created 6 years ago
- Comments: 6 (2 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
LimeTabularExplainer assumes the input matrix is dense, and that each column represents a feature (i.e. it’s not one-hot encoded). Unfortunately this would require you to break the pipeline in two, and letting the explainer do the one-hot encoding. (reopen if this doesn’t answer your question)
Hi marcotcr, could you explain how to break the pipeline in two? I am also interested in how to explain a model that uses mixed types of data (categorical and text). Thanks.
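Breaking the pipeline in two, as suggested above, means building the explainer on the raw feature matrix and wrapping the preprocessing plus model into the predict function that LIME calls. A minimal sketch with stock scikit-learn parts standing in for the custom transformers and xgb (the data, column layout, and `predict_fn` name are all illustrative assumptions; the real case also has a tf-idf text branch, omitted here):

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Illustrative raw data: 'a' and 'b' are categorical codes, 'age' is numeric
rng = np.random.default_rng(0)
X_raw = np.column_stack([
    rng.integers(0, 3, 100),   # categorical 'a'
    rng.integers(0, 2, 100),   # categorical 'b'
    rng.uniform(20, 60, 100),  # numeric 'age'
])
y = rng.normal(size=100)

# Part 1: preprocessing (one-hot + scaling) lives OUTSIDE the explainer
preprocess = ColumnTransformer([
    ('one_hot', OneHotEncoder(handle_unknown='ignore'), [0, 1]),
    ('scale', MinMaxScaler(), [2]),
])
model = LinearRegression().fit(preprocess.fit_transform(X_raw), y)

# Part 2: a predict function over RAW rows -- this is what LIME calls,
# while the explainer itself is built on X_raw with categorical_features=[0, 1]
# so that the explainer handles the one-hot step, as the maintainer describes
def predict_fn(rows):
    return model.predict(preprocess.transform(rows))

print(predict_fn(X_raw[:2]).shape)  # one prediction per raw row: (2,)
```

With lime installed, the explainer would then be built on the raw matrix, e.g. `LimeTabularExplainer(X_raw, feature_names=['a', 'b', 'age'], categorical_features=[0, 1], mode='regression')`, and queried with `explain_instance(X_raw[0], predict_fn)`.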