LRP for text classification - DeepExplain context and attribution sum

Hi all, and thank you very much, Marcoancona, for providing your implementation to explain NNs! It’s very valuable. I am currently working on text classification and would like to understand which words contributed to my classifier’s decision. As there is no NLP example in this project, I followed your pseudocode and guidelines and wrote the following code, which classifies quotations extracted from 4 UK newspapers by their original news source. This sample dataset contains only 500 quotes in total and 4 classes (newspapers). I uploaded the data as numpy arrays (see the link in the code below). The data is already preprocessed, tokenized, transformed into vectors and padded.

Given the code I share below, my question is: why do I get the “You might have forgot to (re)create your graph within the DeepExlain context” warning, even though I reconstruct the model in the DeepExplain context? As I am a relative beginner in Keras, I am also unsure whether the code inside the DeepExplain context corresponds to the pseudocode you provided.

Lastly, I didn’t understand how to find the attributions per word as you described it. I am not sure what to sum, or how to find the initial words (not the vectors) that the attributions correspond to. I appreciate any hint!

Thanks a lot

PS. The model performs poorly, but it’s just a toy example to get familiar with DeepExplain and Keras.

# data processing
import numpy as np
import keras
# classification
from keras import backend as K
from keras.models import Model, Sequential  # Model is needed for fModel below
from keras.layers import Dense, Dropout, Embedding, Flatten
import tensorflow as tf

# get data (already split and preprocessed)
# https://drive.google.com/file/d/19Fil-e8x20n_bP9Art8H0spbuO53i1Yx/view?usp=sharing
X_train = np.loadtxt('quotes_X_train.txt', dtype=int)
X_test = np.loadtxt('quotes_X_test.txt', dtype=int)
y_train = np.loadtxt('quotes_y_train.txt', dtype=int)
y_test = np.loadtxt('quotes_y_test.txt', dtype=int)

# build MLP
model = Sequential()
model.add(Embedding(input_dim=4218+1, output_dim=32, input_length=100))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()  # summary() prints by itself and returns None

# fit and predict
model.fit(X_train, y_train,
          batch_size=32,
          epochs=5,
          validation_data=(X_test, y_test),
          verbose=1,
          shuffle=True)
y_pred = model.predict(np.array(X_test))
y_test = np.array(y_test)
# try to explain the model
# NOTE: `from __future__` imports must be the first statements in a file;
# move them to the top when running this as a single script
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tempfile, sys, os
sys.path.insert(0, os.path.abspath('..'))

# Import DeepExplain
from deepexplain.tensorflow import DeepExplain

current_session = K.get_session()

with DeepExplain(session=current_session) as de:  # <-- init DeepExplain context

    # Get input tensor
    input_tensor = model.layers[0].input
    print("input_tensor --- {}".format(input_tensor))
    # Get embedding tensor
    embedding_tensor = model.layers[0].output
    print("embedding_tensor --- {}".format(embedding_tensor))
    # Get tensor before the final activation
    pre_softmax_tensor = model.layers[-1].output
    print("pre_softmax_tensor --- {} ".format(pre_softmax_tensor))
    # Create model until before softmax
    fModel = Model(inputs=input_tensor, outputs=model.layers[-1].output)

    # Evaluate the embedding tensor on the model input (in other words, perform the lookup)
    embedding_out = current_session.run(embedding_tensor, {input_tensor: X_test})

    xs = X_test
    ys = y_test

    # Run DeepExplain with the embedding as input
    print("\nCalling deep explain soon ....\n")
    print("pre_softmax_tensor * ys shape --- {}".format((pre_softmax_tensor * ys).shape))
    print("embedding_tensor shape --- {}".format(embedding_tensor.shape))
    print("embedding_out shape --- {}\n".format(embedding_out.shape))
    attributions = de.explain('elrp', pre_softmax_tensor * ys, embedding_tensor, embedding_out)
    print("attributions shape --- {}".format(attributions.shape))

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

marcoancona commented, Mar 23, 2018 (1 reaction)

I also had some difficulty recreating the graph correctly in the DeepExplain context. I will have to think about it; in the meantime, I suggest using a single model, created and trained within the DeepExplain context:

from keras.layers import Activation  # needed for the separate softmax layer below

with DeepExplain(session=current_session) as de:  # <-- init DeepExplain context

    model = Sequential()
    model.add(Embedding(input_dim=4218+1, output_dim=32, input_length=100))
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dropout(0.5))
    # Keep the last Dense layer linear and apply softmax as a separate layer,
    # so that model.layers[-2] exposes the pre-softmax output
    model.add(Dense(4, activation='linear'))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.summary()

    model.fit(X_train, y_train,
              batch_size=32,
              epochs=5,
              validation_data=(X_test, y_test),
              verbose=1,
              shuffle=True)

    # predict on test data
    y_pred = model.predict(np.array(X_test))
    y_test = np.array(y_test)

    # Evaluate the embedding tensor on the model input (in other words, perform the lookup)
    embedding_tensor = model.layers[0].output
    input_tensor = model.inputs[0]
    embedding_out = current_session.run(embedding_tensor, {input_tensor: X_test})

    xs = X_test
    ys = y_test
    # Run DeepExplain with the embedding as input:
    # the target is the pre-softmax output masked by the labels,
    # and model.layers[1] is Flatten, whose input is the embedding output
    attributions = de.explain('elrp', model.layers[-2].output * ys, model.layers[1].input, embedding_out)
    print("attributions shape --- {}".format(attributions.shape))
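
On the per-word question raised in the issue: the attributions computed on the embedding have one value per embedding dimension, so one common way to get a single score per word is to sum over the embedding axis. The following is a minimal sketch with toy data and is not part of the original reply. It assumes attributions of shape (samples, sequence length, embedding size), a padding id of 0, and a hypothetical `index_to_word` dictionary (for example, the inverse of the vocabulary used during tokenization). The helper name `per_word_attributions` is also made up for illustration.

```python
import numpy as np


def per_word_attributions(attributions, token_ids, index_to_word, pad_id=0):
    """Collapse (samples, seq_len, emb_dim) attributions to one score per token.

    Hypothetical helper: `index_to_word` and `pad_id` are assumptions about
    how the text was tokenized and padded, not part of DeepExplain.
    """
    attributions = np.asarray(attributions)
    token_ids = np.asarray(token_ids)
    # Sum over the embedding axis: one relevance value per token position
    word_scores = attributions.sum(axis=-1)
    results = []
    for ids, scores in zip(token_ids, word_scores):
        # Skip padding positions and map each remaining id back to its word
        pairs = [(index_to_word.get(int(i), "<unk>"), float(s))
                 for i, s in zip(ids, scores) if int(i) != pad_id]
        results.append(pairs)
    return results


# Toy stand-ins for `attributions` and `X_test` (1 sample, 3 tokens, 2 dims)
toy_attributions = np.array([[[0.1, 0.2], [0.0, 0.0], [-0.3, 0.1]]])
toy_token_ids = np.array([[5, 0, 7]])
toy_index_to_word = {5: "brexit", 7: "vote"}

toy_result = per_word_attributions(toy_attributions, toy_token_ids, toy_index_to_word)
print(toy_result)
```

With the real data, one would pass `attributions`, `X_test` and the vocabulary that was used to build `X_test` in place of the toy arrays.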

pramitchoudhary commented, Apr 11, 2018 (0 reactions)

Oh yes, that’s right. Thanks for catching that. Also, agreed that this version of LRP is not the right choice for LSTMs. I have updated the code. One can build a model and then persist it outside the DeepExplain context. The model can then be loaded into the context, and the relevant algorithm (Integrated Gradient or) could be applied.
