
Integrated Gradients for Intent Classification & NER


Hi everyone,

I’m trying to use Captum with OneNet using AllenNLP. This is the model structure:

OneNet(
  (text_field_embedder): BasicTextFieldEmbedder(
    (token_embedder_token_characters): TokenCharactersEncoder(
      (_embedding): TimeDistributed(
        (_module): Embedding()
      )
      (_encoder): TimeDistributed(
        (_module): CnnEncoder(
          (_activation): ReLU()
          (conv_layer_0): Conv1d(3, 128, kernel_size=(3,), stride=(1,))
        )
      )
    )
    (token_embedder_tokens): Embedding()
  )
  (encoder): PytorchSeq2SeqWrapper(
    (_module): LSTM(178, 200, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
  )
  (dropout): Dropout(p=0.5, inplace=False)
  (tag_projection_layer): TimeDistributed(
    (_module): Linear(in_features=400, out_features=57, bias=True)
  )
  (intent_projection_layer): Linear(in_features=400, out_features=20, bias=True)
  (ce_loss): CrossEntropyLoss()
)

And this is a sample output:

sample = "xe day con mau vàng k sh"

{'tag_logits': array([[  8.193845,  18.159725,   3.070817,  -3.669226, ...,  -7.021739,  -9.783165, -10.414617, -14.490005],
        [ 11.836643,   4.574325,  17.798481,  -0.146769, ...,  -7.323572,  -8.025657, -11.729625, -15.194502],
        [ 13.941337,  -3.660825,   8.000876,   2.282541, ...,  -9.944183, -12.441767, -12.626238, -19.164455],
        [  4.384309,  -4.350079,  -4.387915,   2.233547, ...,  -9.741117,  -9.724459, -12.436659, -15.250616],
        [  2.312785,  -6.687048,  -6.087758,  -2.759617, ...,  -3.623748,   1.016447,  -6.195989,  -5.572791],
        [ 16.199913,  -3.463409,  -1.805555,  -3.65419 , ...,  -6.689859,  -1.246313,  -6.765724,  -7.277429],
        [ 15.870321,  -0.451358,  -3.963183,  -3.106677, ...,  -7.761865,  -7.660899,  -7.337141, -12.257715]],
       dtype=float32),
 'mask': array([1, 1, 1, 1, 1, 1, 1]),
 'tags': ['B-inform#object_type',
  'I-inform#object_type',
  'O',
  'B-inform#color',
  'I-inform#color',
  'O',
  'O'],
 'intent_probs': array([4.672179e-04, 9.995320e-01, 4.606894e-07, 7.099485e-09, 7.334805e-08, 5.847836e-09, 5.091730e-08, 3.163775e-09,
        3.281502e-09, 2.609285e-09, 1.424896e-11, 1.173377e-08, 1.556584e-09, 5.412061e-08, 4.719907e-08, 1.678568e-08,
        9.755362e-09, 1.716321e-08, 3.199067e-09, 4.867611e-10], dtype=float32),
 'words': ['xe', 'day', 'con', 'mau', 'vàng', 'k', 'sh'],
 'intent': 'inform',
 'span_tags': [('inform#object_type', 'inform#object_type (0, 1): xe day'),
  ('inform#color', 'inform#color (3, 4): mau vàng')],
 'nlu': {'inform': [('inform#object_type',
    'inform#object_type (0, 1): xe day'),
   ('inform#color', 'inform#color (3, 4): mau vàng')]}}

I got an error when I used LayerIntegratedGradients:

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 2. Got 1 and 500 in dimension 0 at /pytorch/aten/src/TH/generic/THTensor.cpp:612
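
For context, the numbers in this error are consistent with the printed model's shapes: Captum expands the interpreted input along dimension 0 (here to a batch of 500), and any tensor the forward pass still sees at its original batch size of 1 will then break the torch.cat inside BasicTextFieldEmbedder, which joins the token embeddings with the character-CNN output along dimension 2 to build the LSTM's 178-dimensional input. Below is a minimal sketch of the mismatch and one possible workaround; the embedding sizes (50 and 128) are assumptions read off the model printout, not taken from the notebook.

import torch

# BasicTextFieldEmbedder concatenates the two embeddings along dim 2.
token_emb = torch.randn(500, 7, 50)  # expanded by Captum to 500 rows
char_emb = torch.randn(1, 7, 128)    # still at the original batch size of 1

# torch.cat((token_emb, char_emb), dim=2)  # -> raises the RuntimeError above

# Workaround: replicate the fixed tensor to the expanded batch size
# before (or inside) the forward function that Captum calls.
char_emb = char_emb.expand(token_emb.shape[0], -1, -1)
combined = torch.cat((token_emb, char_emb), dim=2)
print(combined.shape)  # torch.Size([500, 7, 178]) -- matches the LSTM input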

Here is the notebook: https://gist.github.com/ducpvftech/1cdb03429a7b9dbf7036d5c4c511ec45

Can you guide me on how to make this work for both intent classification and NER?

Thank you for considering my request.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 18 (9 by maintainers)

Top GitHub Comments

NarineK commented, Nov 30, 2020 (1 reaction)

Hi @ducpvftech, yes, I understand that in your example you get it working without input_indices. I brought it up as an example to explain LayerIG. LayerIG needs to take torch tensors as input for the inputs that we want to interpret. W.r.t. the NER, is the NER prediction score stored in tag_logits? I think there are two options:

  1. You need to attribute the NER score to the inputs of the model, similar to what you do for intent now, in a separate ig.attribute call with a slightly modified forward function that returns the NER score. tag_logits has dimension 1 x num_tokens x num_embeddings, right? I think we can sum across the num_embeddings dimension and attribute to each tag. Basically, your modified forward function will return the summed score for each token, one at a time in a loop, or you could also do that in a batch. A loop would be easier for the first try (see the sketch after this list).
  2. If you want to combine intent and NER and call attribute once on something representing the summation of intent and NER, you can, for example, sum the intent and NER scores together in the forward function, return that score, and attribute w.r.t. that summed score.
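
A minimal sketch of both options, with hypothetical names throughout: model_forward stands in for a function that runs OneNet on precomputed embeddings and returns tensors shaped like the output dict above, and input_embedding is the interpretable embedding of the sentence (as built in the snippet further down the thread); none of these names are taken from the notebook.

import torch
from captum.attr import IntegratedGradients

def make_tag_forward(model_forward, token_index):
    # Option 1: return the summed tag score for a single token, so each
    # attribute call explains that one token's NER prediction.
    def forward(embeddings):
        out = model_forward(embeddings)   # hypothetical entry point
        tag_logits = out["tag_logits"]    # 1 x num_tokens x num_tags
        return tag_logits[:, token_index, :].sum(dim=-1)
    return forward

num_tokens = 7  # length of the sample sentence
for i in range(num_tokens):
    ig = IntegratedGradients(make_tag_forward(model_forward, i))
    tag_attributions = ig.attribute(input_embedding, n_steps=50)

def combined_forward(embeddings):
    # Option 2: a single attribute call against the summed intent + NER score.
    out = model_forward(embeddings)
    intent_score = out["intent_probs"].max(dim=-1).values
    ner_score = out["tag_logits"].sum(dim=(1, 2))
    return intent_score + ner_score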

Does this make sense? Thank you, Narine

ducpvftech commented, Nov 24, 2020 (1 reaction)

Hi @NarineK, thank you for your feedback, and sorry for the late response.

In the Colab link you sent, I don’t think using input_indices will make any difference, since I passed an embedding to the forward function; this works without any problem for any input sentence:

# Build an AllenNLP Instance from the tokenized sentence and batch it.
instance = learn.dataset_reader.text_to_instance(tokens_allennlp)
batch = next(iter(iterator([instance])))
print("input_indices: ", input_indices)
# Convert the token indices to embeddings so they can be passed straight
# to the forward function.
input_embedding = interpretable_embedding.indices_to_embeddings(batch["tokens"])
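
For readers following along: interpretable_embedding above is presumably created with Captum's helper that swaps an embedding layer for a pass-through wrapper, so the model's forward can accept precomputed embeddings. A minimal sketch, assuming the wrapped layer is the text_field_embedder submodule shown in the model printout (model names the loaded OneNet):

from captum.attr import (
    configure_interpretable_embedding_layer,
    remove_interpretable_embedding_layer,
)

# Replace the embedder with an identity wrapper; the layer name matches
# the submodule in the printed model structure.
interpretable_embedding = configure_interpretable_embedding_layer(
    model, "text_field_embedder"
)

# ... call indices_to_embeddings / attribute as in the snippet above ...

# Restore the original embedding layer when finished.
remove_interpretable_embedding_layer(model, interpretable_embedding)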

And in the past, I tried using input_indices only, but got an error.

So, I think using IntegratedGradients suits me well.

In my case, the model returns multiple outputs (intent & NER). How can I make IntegratedGradients work for both of them? I mean, there should be an explanation for the intent and an explanation for the NER.


