Integrated Gradients for Intent Classification & NER
Hi everyone,
I'm trying to use Captum with a OneNet model built with AllenNLP. This is the model structure:
OneNet(
  (text_field_embedder): BasicTextFieldEmbedder(
    (token_embedder_token_characters): TokenCharactersEncoder(
      (_embedding): TimeDistributed(
        (_module): Embedding()
      )
      (_encoder): TimeDistributed(
        (_module): CnnEncoder(
          (_activation): ReLU()
          (conv_layer_0): Conv1d(3, 128, kernel_size=(3,), stride=(1,))
        )
      )
    )
    (token_embedder_tokens): Embedding()
  )
  (encoder): PytorchSeq2SeqWrapper(
    (_module): LSTM(178, 200, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
  )
  (dropout): Dropout(p=0.5, inplace=False)
  (tag_projection_layer): TimeDistributed(
    (_module): Linear(in_features=400, out_features=57, bias=True)
  )
  (intent_projection_layer): Linear(in_features=400, out_features=20, bias=True)
  (ce_loss): CrossEntropyLoss()
)
And this is a sample output:
sample = "xe day con mau vàng k sh"
{'tag_logits': array([[ 8.193845, 18.159725, 3.070817, -3.669226, ..., -7.021739, -9.783165, -10.414617, -14.490005],
[ 11.836643, 4.574325, 17.798481, -0.146769, ..., -7.323572, -8.025657, -11.729625, -15.194502],
[ 13.941337, -3.660825, 8.000876, 2.282541, ..., -9.944183, -12.441767, -12.626238, -19.164455],
[ 4.384309, -4.350079, -4.387915, 2.233547, ..., -9.741117, -9.724459, -12.436659, -15.250616],
[ 2.312785, -6.687048, -6.087758, -2.759617, ..., -3.623748, 1.016447, -6.195989, -5.572791],
[ 16.199913, -3.463409, -1.805555, -3.65419 , ..., -6.689859, -1.246313, -6.765724, -7.277429],
[ 15.870321, -0.451358, -3.963183, -3.106677, ..., -7.761865, -7.660899, -7.337141, -12.257715]],
dtype=float32),
'mask': array([1, 1, 1, 1, 1, 1, 1]),
'tags': ['B-inform#object_type',
'I-inform#object_type',
'O',
'B-inform#color',
'I-inform#color',
'O',
'O'],
'intent_probs': array([4.672179e-04, 9.995320e-01, 4.606894e-07, 7.099485e-09, 7.334805e-08, 5.847836e-09, 5.091730e-08, 3.163775e-09,
3.281502e-09, 2.609285e-09, 1.424896e-11, 1.173377e-08, 1.556584e-09, 5.412061e-08, 4.719907e-08, 1.678568e-08,
9.755362e-09, 1.716321e-08, 3.199067e-09, 4.867611e-10], dtype=float32),
'words': ['xe', 'day', 'con', 'mau', 'vàng', 'k', 'sh'],
'intent': 'inform',
'span_tags': [('inform#object_type', 'inform#object_type (0, 1): xe day'),
('inform#color', 'inform#color (3, 4): mau vàng')],
'nlu': {'inform': [('inform#object_type',
'inform#object_type (0, 1): xe day'),
('inform#color', 'inform#color (3, 4): mau vàng')]}}
I got an error when I used LayerIntegratedGradients:
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 2. Got 1 and 500 in dimension 0 at /pytorch/aten/src/TH/generic/THTensor.cpp:612
Here is the notebook: https://gist.github.com/ducpvftech/1cdb03429a7b9dbf7036d5c4c511ec45
Can you guide me on how to make this work for both intent classification and NER?
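For context on what LayerIntegratedGradients is computing (the "500 in dimension 0" in the error presumably comes from its internal n_steps expansion of the input batch), here is a minimal from-scratch sketch of Integrated Gradients in plain PyTorch. It is an illustration only, not Captum's implementation: IG integrates the gradient along the straight-line path from a baseline to the input, and the resulting attributions sum to f(input) - f(baseline) (the completeness axiom).

```python
import torch

def integrated_gradients(forward_fn, inputs, baseline=None, n_steps=64):
    """Riemann-sum approximation of Integrated Gradients along the
    straight path from `baseline` to `inputs`. Illustrative sketch,
    not Captum's LayerIntegratedGradients."""
    if baseline is None:
        baseline = torch.zeros_like(inputs)
    total = torch.zeros_like(inputs)
    for a in torch.linspace(0.0, 1.0, n_steps):
        # Interpolated point on the baseline -> input path.
        point = (baseline + a * (inputs - baseline)).detach().requires_grad_(True)
        forward_fn(point).backward()
        total += point.grad
    # Average gradient along the path, scaled by the input delta.
    return (inputs - baseline) * total / n_steps

x = torch.tensor([1.0, 2.0, 3.0])
# For f(x) = sum(x^2) with a zero baseline, IG gives x_i^2 per feature,
# and the attributions sum to f(x) - f(0) = 14 (completeness).
attr = integrated_gradients(lambda t: (t ** 2).sum(), x)
```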
Thank you for considering my request.
Issue Analytics
- Created: 3 years ago
- Comments: 18 (9 by maintainers)
Hi @ducpvftech, yes, I understand that in your example you get it working without `input_indices`. I brought it up as an example to explain LayerIG. LayerIG needs to take torch tensors as input for the inputs that we want to interpret.

W.r.t. NER: is the NER prediction score stored in `tag_logits`? I think there are two options. One is an `ig.attribute` call with a slightly modified forward function that returns the NER score. `tag_logits` has dimension 1 x num_tokens x num_embeddings, right? I think we can sum across the num_embeddings dimension and attribute to each tag. Basically, your modified forward function will return the summed score for each token, one at a time in a loop, or you could also do that in a batch. A loop would be easier for the first try.

Does this make sense? Thank you, Narine
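The modified-forward idea above can be sketched with a toy stand-in for the tagger (a single linear layer; all sizes and names here are hypothetical, not OneNet's real modules). Plain gradients are used just to check that the per-token forward returns a scalar with the right sensitivity pattern before wiring it into an `ig.attribute` call:

```python
import torch
import torch.nn as nn

# Toy stand-in for the tag head: embeddings -> per-token tag logits.
# 7 tokens / 57 tags mirror the sample output; emb_dim=16 is made up.
torch.manual_seed(0)
num_tokens, emb_dim, num_tags = 7, 16, 57
tagger = nn.Linear(emb_dim, num_tags)
embeddings = torch.randn(1, num_tokens, emb_dim)

def tag_forward(emb, token_index):
    # Modified forward as suggested: sum one token's tag logits
    # into a single score.
    tag_logits = tagger(emb)                    # (1, num_tokens, num_tags)
    return tag_logits[:, token_index, :].sum()

# Loop over tokens, attributing each token's summed score to the embeddings.
attributions = []
for i in range(num_tokens):
    emb = embeddings.clone().requires_grad_(True)
    tag_forward(emb, i).backward()
    attributions.append(emb.grad.clone())
```

Because the toy tagger is applied per token, the gradient for token i's score is non-zero only at position i; with a real contextual encoder (the BiLSTM) every position would receive attribution.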
Hi @NarineK, thank you for your feedback and sorry for the late response.

In the Colab link you sent, I don't think using `input_indices` will make any difference, since I passed an embedding to the forward function; this works without any problem for any input sentence. In the past, I tried using `input_indices` only, but got an error. So, I think using `IntegratedGradients` suits me well.

In my case, the model returns multiple outputs (intent & NER). How can I make `IntegratedGradients` work for both of them? I mean, there would be an explanation for intent and an explanation for NER?
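One common pattern for a multi-output model is to run the attribution twice with two wrapper forwards over the same encoder output: one returning the intent score, one returning a per-token tag score. The sketch below uses a hypothetical two-headed stand-in (made-up sizes, not OneNet's real forward) and a hand-rolled Riemann-sum IG, so it stays self-contained; with Captum you would pass each wrapper to its own `IntegratedGradients` instance instead:

```python
import torch
import torch.nn as nn

# Hypothetical two-headed model standing in for OneNet: an intent head
# and a tag head over shared token embeddings. Sizes are illustrative.
torch.manual_seed(0)
num_tokens, emb_dim, num_intents, num_tags = 7, 16, 20, 57
intent_head = nn.Linear(emb_dim, num_intents)
tag_head = nn.Linear(emb_dim, num_tags)
embeddings = torch.randn(1, num_tokens, emb_dim)

def intent_forward(emb):
    # Pool tokens, score intents, return the predicted intent's logit
    # (index 1, matching the argmax of intent_probs in the sample output).
    pooled = emb.mean(dim=1)                    # (1, emb_dim)
    return intent_head(pooled)[:, 1]            # (1,)

def tag_forward(emb, token_index):
    # Summed tag score for one token, per the suggestion above.
    return tag_head(emb)[:, token_index, :].sum()

def ig(forward_fn, inputs, n_steps=32):
    # Riemann-sum Integrated Gradients from a zero baseline.
    total = torch.zeros_like(inputs)
    for a in torch.linspace(0.0, 1.0, n_steps):
        point = (a * inputs).detach().requires_grad_(True)
        forward_fn(point).sum().backward()
        total += point.grad
    return inputs * total / n_steps

intent_attr = ig(intent_forward, embeddings)    # one explanation for intent
tag_attrs = [ig(lambda e, i=i: tag_forward(e, i), embeddings)
             for i in range(num_tokens)]        # one explanation per token (NER)
```

Each call produces an independent attribution tensor over the same embeddings, so intent and NER each get their own explanation.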