LRP for text classification - DeepExplain context and attribution sum
Hi all, and thank you very much, Marcoancona, for providing your implementation to explain NNs! It's very valuable. I am currently working on text classification and I would like to understand which words contributed to the decision of my classifier. As there is no NLP example in this project, I followed your pseudocode and guidelines and wrote the following code to classify quotations extracted from four UK newspapers into their original news sources. In this sample dataset there are only 500 quotes in total and 4 classes (newspapers). I uploaded the data as numpy arrays here. The data is already preprocessed, tokenized, transformed into vectors, and padded.
Given the code I share below, my questions are: why do I get the "You might have forgot to (re)create your graph within the DeepExplain context" warning, even though I reconstruct the model inside the DeepExplain context? As I am a relative beginner in Keras, I am also unsure whether the code inside the DeepExplain context corresponds to the pseudocode you provided. Lastly, I did not understand how to obtain the attributions per word as you described: I am not sure what to sum, nor how to map the attributions back to the original words rather than to the embedding vectors. I appreciate any hint! Thanks a lot.
PS: The model performs poorly, but it's just a toy example to get familiar with DeepExplain and Keras.
# data processing
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import keras
# classification
from keras import backend as K
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Embedding, Flatten
import tensorflow as tf
# get data (already split and preprocessed)
# https://drive.google.com/file/d/19Fil-e8x20n_bP9Art8H0spbuO53i1Yx/view?usp=sharing
X_train = np.loadtxt('quotes_X_train.txt', dtype=int)
X_test = np.loadtxt('quotes_X_test.txt', dtype=int)
y_train = np.loadtxt('quotes_y_train.txt', dtype=int)
y_test = np.loadtxt('quotes_y_test.txt', dtype=int)
# build MLP
model = Sequential()
model.add(Embedding(input_dim=4218 + 1, output_dim=32, input_length=100))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print(model.summary())
# fit and predict
model.fit(X_train, y_train,
          batch_size=32,
          epochs=5,
          validation_data=(X_test, y_test),
          verbose=1,
          shuffle=True)
y_pred = model.predict(np.array(X_test))
y_test = np.array(y_test)
# try to explain the model
import tempfile, sys, os
sys.path.insert(0, os.path.abspath('..'))

# Import DeepExplain
from deepexplain.tensorflow import DeepExplain

current_session = K.get_session()
with DeepExplain(session=current_session) as de:  # <-- init DeepExplain context
    # Get input tensor
    input_tensor = model.layers[0].input
    print("input_tensor --- {}".format(input_tensor))

    # Get embedding tensor
    embedding_tensor = model.layers[0].output
    print("embedding_tensor --- {}".format(embedding_tensor))

    # Get tensor before the final activation
    pre_softmax_tensor = model.layers[-1].output
    print("pre_softmax_tensor --- {}".format(pre_softmax_tensor))

    # Create model until before softmax
    fModel = Model(inputs=input_tensor, outputs=model.layers[-1].output)

    # Evaluate the embedding tensor on the model input (in other words, perform the lookup)
    embedding_out = current_session.run(embedding_tensor, {input_tensor: X_test})
    xs = X_test
    ys = y_test

    # Run DeepExplain with the embedding as input
    print("\nCalling deep explain soon ....\n")
    print("pre_softmax_tensor * ys shape --- {}".format((pre_softmax_tensor * ys).shape))
    print("embedding_tensor shape --- {}".format(embedding_tensor.shape))
    print("embedding_out shape --- {}\n".format(embedding_out.shape))
    attributions = de.explain('elrp', pre_softmax_tensor * ys, embedding_tensor, embedding_out)
    print("attributions shape --- {}".format(attributions.shape))
Top GitHub Comments
I also had some difficulty recreating the graph correctly in the DeepExplain context. I will have to think about it; in the meantime, I suggest using a single model and creating and training it within the DeepExplain context:
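A minimal sketch of that suggestion (the original snippet is not preserved in this excerpt), assuming the same toy MLP and the X_train/y_train/X_test/y_test arrays from the script above:

# Sketch: build, train, and explain within one DeepExplain context,
# so every graph op is created while the gradient overrides are registered.
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, Flatten
from deepexplain.tensorflow import DeepExplain

with DeepExplain(session=K.get_session()) as de:
    model = Sequential()
    model.add(Embedding(input_dim=4218 + 1, output_dim=32, input_length=100))
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(X_train, y_train, batch_size=32, epochs=5, verbose=1)

    input_tensor = model.layers[0].input
    embedding_tensor = model.layers[0].output
    embedding_out = K.get_session().run(embedding_tensor, {input_tensor: X_test})

    # target tensor masked by the one-hot labels, embedding used as the explained input
    attributions = de.explain('elrp', model.layers[-1].output * y_test,
                              embedding_tensor, embedding_out)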
Oh yes, that's right. Thanks for catching that. Also, agreed on this version of LRP not being the right choice for LSTMs. I have updated the code.

One can build a model and then persist it outside the DeepExplain context. The model can then be loaded inside the context and the relevant algorithm (Integrated Gradients or …) can be applied.
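A rough sketch of that persist-and-reload pattern; the file name is illustrative and Integrated Gradients ('intgrad') is used only as an example method, none of this is taken verbatim from the original comment:

# Sketch: train and save outside the context, reload inside it for explanation.
from keras import backend as K
from keras.models import load_model
from deepexplain.tensorflow import DeepExplain

model.save('quotes_mlp.h5')  # persist the trained model outside the DeepExplain context

with DeepExplain(session=K.get_session()) as de:
    # Loading inside the context (re)creates the graph with DeepExplain's overrides in place
    restored = load_model('quotes_mlp.h5')
    input_tensor = restored.layers[0].input
    embedding_tensor = restored.layers[0].output
    embedding_out = K.get_session().run(embedding_tensor, {input_tensor: X_test})
    attributions = de.explain('intgrad', restored.layers[-1].output * y_test,
                              embedding_tensor, embedding_out)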