
Integrated Gradients for Intent Classification & NER


Hi everyone,

I’m trying to use Captum with OneNet using AllenNLP. This is the model structure:

OneNet(
  (text_field_embedder): BasicTextFieldEmbedder(
    (token_embedder_token_characters): TokenCharactersEncoder(
      (_embedding): TimeDistributed(
        (_module): Embedding()
      )
      (_encoder): TimeDistributed(
        (_module): CnnEncoder(
          (_activation): ReLU()
          (conv_layer_0): Conv1d(3, 128, kernel_size=(3,), stride=(1,))
        )
      )
    )
    (token_embedder_tokens): Embedding()
  )
  (encoder): PytorchSeq2SeqWrapper(
    (_module): LSTM(178, 200, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
  )
  (dropout): Dropout(p=0.5, inplace=False)
  (tag_projection_layer): TimeDistributed(
    (_module): Linear(in_features=400, out_features=57, bias=True)
  )
  (intent_projection_layer): Linear(in_features=400, out_features=20, bias=True)
  (ce_loss): CrossEntropyLoss()
)

And this is a sample output:

sample = "xe day con mau vàng k sh"

{'tag_logits': array([[  8.193845,  18.159725,   3.070817,  -3.669226, ...,  -7.021739,  -9.783165, -10.414617, -14.490005],
        [ 11.836643,   4.574325,  17.798481,  -0.146769, ...,  -7.323572,  -8.025657, -11.729625, -15.194502],
        [ 13.941337,  -3.660825,   8.000876,   2.282541, ...,  -9.944183, -12.441767, -12.626238, -19.164455],
        [  4.384309,  -4.350079,  -4.387915,   2.233547, ...,  -9.741117,  -9.724459, -12.436659, -15.250616],
        [  2.312785,  -6.687048,  -6.087758,  -2.759617, ...,  -3.623748,   1.016447,  -6.195989,  -5.572791],
        [ 16.199913,  -3.463409,  -1.805555,  -3.65419 , ...,  -6.689859,  -1.246313,  -6.765724,  -7.277429],
        [ 15.870321,  -0.451358,  -3.963183,  -3.106677, ...,  -7.761865,  -7.660899,  -7.337141, -12.257715]],
       dtype=float32),
 'mask': array([1, 1, 1, 1, 1, 1, 1]),
 'tags': ['B-inform#object_type',
  'I-inform#object_type',
  'O',
  'B-inform#color',
  'I-inform#color',
  'O',
  'O'],
 'intent_probs': array([4.672179e-04, 9.995320e-01, 4.606894e-07, 7.099485e-09, 7.334805e-08, 5.847836e-09, 5.091730e-08, 3.163775e-09,
        3.281502e-09, 2.609285e-09, 1.424896e-11, 1.173377e-08, 1.556584e-09, 5.412061e-08, 4.719907e-08, 1.678568e-08,
        9.755362e-09, 1.716321e-08, 3.199067e-09, 4.867611e-10], dtype=float32),
 'words': ['xe', 'day', 'con', 'mau', 'vàng', 'k', 'sh'],
 'intent': 'inform',
 'span_tags': [('inform#object_type', 'inform#object_type (0, 1): xe day'),
  ('inform#color', 'inform#color (3, 4): mau vàng')],
 'nlu': {'inform': [('inform#object_type',
    'inform#object_type (0, 1): xe day'),
   ('inform#color', 'inform#color (3, 4): mau vàng')]}}

I got an error when I used LayerIntegratedGradients:

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 2. Got 1 and 500 in dimension 0 at /pytorch/aten/src/TH/generic/THTensor.cpp:612
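
For context, the numbers in this error are consistent with the printed model's shapes: Captum expands the interpreted input along dimension 0 (here to a batch of 500), and any tensor the forward pass still sees at its original batch size of 1 will then break the torch.cat inside BasicTextFieldEmbedder, which joins the token embeddings with the character-CNN output along dimension 2 to build the LSTM's 178-dimensional input. Below is a minimal sketch of the mismatch and one possible workaround; the embedding sizes (50 and 128) are assumptions read off the model printout, not taken from the notebook.

import torch

# BasicTextFieldEmbedder concatenates the two embeddings along dim 2.
token_emb = torch.randn(500, 7, 50)  # expanded by Captum to 500 rows
char_emb = torch.randn(1, 7, 128)    # still at the original batch size of 1

# torch.cat((token_emb, char_emb), dim=2)  # -> raises the RuntimeError above

# Workaround: replicate the fixed tensor to the expanded batch size
# before (or inside) the forward function that Captum calls.
char_emb = char_emb.expand(token_emb.shape[0], -1, -1)
combined = torch.cat((token_emb, char_emb), dim=2)
print(combined.shape)  # torch.Size([500, 7, 178]) -- matches the LSTM input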

Here is the notebook: https://gist.github.com/ducpvftech/1cdb03429a7b9dbf7036d5c4c511ec45

Can you guide me on how to make this work for both intent classification and NER?

Thank you for considering my request.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 18 (9 by maintainers)

Top GitHub Comments

NarineK commented, Nov 30, 2020 (1 reaction)

Hi @ducpvftech, yes, I understand that in your example you get it working without input_indices. I brought it up as an example to explain LayerIG. LayerIG needs to take torch tensors as input for the inputs that we want to interpret. W.r.t. the NER, is the NER prediction score stored in tag_logits? I think there are two options:

  1. You need to attribute the NER score to the inputs of the model, similar to what you do for intent now, in a separate ig.attribute call with a slightly modified forward function that returns the NER score. tag_logits has dimension 1 x num_tokens x num_embeddings, right? I think we can sum across the num_embeddings dimension and attribute to each tag. Basically, your modified forward function will return the summed score for each token, one at a time in a loop, or you could also do that in a batch. A loop would be easier for the first try (see the sketch after this list).
  2. If you want to combine intent and NER and call attribute once on something representing the summation of intent and NER, you can, for example, sum the intent and NER scores together in the forward function, return that score, and attribute w.r.t. that summed score.
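
A minimal sketch of both options, with hypothetical names throughout: model_forward stands in for a function that runs OneNet on precomputed embeddings and returns tensors shaped like the output dict above, and input_embedding is the interpretable embedding of the sentence (as built in the snippet further down the thread); none of these names are taken from the notebook.

import torch
from captum.attr import IntegratedGradients

def make_tag_forward(model_forward, token_index):
    # Option 1: return the summed tag score for a single token, so each
    # attribute call explains that one token's NER prediction.
    def forward(embeddings):
        out = model_forward(embeddings)   # hypothetical entry point
        tag_logits = out["tag_logits"]    # 1 x num_tokens x num_tags
        return tag_logits[:, token_index, :].sum(dim=-1)
    return forward

num_tokens = 7  # length of the sample sentence
for i in range(num_tokens):
    ig = IntegratedGradients(make_tag_forward(model_forward, i))
    tag_attributions = ig.attribute(input_embedding, n_steps=50)

def combined_forward(embeddings):
    # Option 2: a single attribute call against the summed intent + NER score.
    out = model_forward(embeddings)
    intent_score = out["intent_probs"].max(dim=-1).values
    ner_score = out["tag_logits"].sum(dim=(1, 2))
    return intent_score + ner_score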

Does this make sense? Thank you, Narine

ducpvftech commented, Nov 24, 2020 (1 reaction)

Hi @NarineK, thank you for your feedback, and sorry for the late response.

In the Colab link you sent, I don’t think using input_indices will make any difference, since I passed an embedding to the forward function; this works without any problem for any input sentence:

# Build an AllenNLP Instance from the tokenized sentence and batch it.
instance = learn.dataset_reader.text_to_instance(tokens_allennlp)
batch = next(iter(iterator([instance])))
print("input_indices: ", input_indices)
# Convert the token indices to embeddings so they can be passed straight
# to the forward function.
input_embedding = interpretable_embedding.indices_to_embeddings(batch["tokens"])
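
For readers following along: interpretable_embedding above is presumably created with Captum's helper that swaps an embedding layer for a pass-through wrapper, so the model's forward can accept precomputed embeddings. A minimal sketch, assuming the wrapped layer is the text_field_embedder submodule shown in the model printout (model names the loaded OneNet):

from captum.attr import (
    configure_interpretable_embedding_layer,
    remove_interpretable_embedding_layer,
)

# Replace the embedder with an identity wrapper; the layer name matches
# the submodule in the printed model structure.
interpretable_embedding = configure_interpretable_embedding_layer(
    model, "text_field_embedder"
)

# ... call indices_to_embeddings / attribute as in the snippet above ...

# Restore the original embedding layer when finished.
remove_interpretable_embedding_layer(model, interpretable_embedding)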

And in the past, I tried using input_indices only, but got an error.

So, I think using IntegratedGradients suits me well.

In my case, the model returns multiple outputs (intent & NER). How can I make IntegratedGradients work for both of them? I mean, there should be an explanation for the intent and an explanation for the NER.


