RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long
Hello,
I am pretty new to captum. I am trying to run the IntegratedGradients method for transformer-based sequence classification. However, I get the following error:
RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.FloatTensor instead (while checking arguments for embedding)
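For context, this is the generic PyTorch error raised whenever an embedding layer receives non-integer indices; it can be reproduced with a bare nn.Embedding, for example:

import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10, embedding_dim=4)
float_idx = torch.tensor([[1.0, 2.0, 3.0]])   # float32 "indices"
# emb(float_idx)                              # raises the RuntimeError above
out = emb(float_idx.long())                   # works once indices are int64 (Long)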
Here is my code:
import os
import sys
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertForQuestionAnswering, BertConfig
from transformers import RobertaTokenizer, RobertaForSequenceClassification, RobertaConfig
from sklearn.preprocessing import LabelEncoder
from captum.attr import visualization as viz
from captum.attr import IntegratedGradients, LayerConductance, LayerIntegratedGradients
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer
import re
from itertools import cycle          # needed for cycle() in encode_sentences below
from scipy.special import softmax    # needed for softmax() in predict_instance below
device = torch.device("cpu")
# load model
model = RobertaForSequenceClassification.from_pretrained(
    'roberta-base',
    num_labels=4,
    output_attentions=False,
    output_hidden_states=False,
)
model.to(device)
model.eval()
model.zero_grad()
# load tokenizer
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
def encode_sentences(tokenizer, article_list, evidence_list, max_len=512):
    """
    Encodes article/evidence pair and returns tensor
    :param tokenizer: tokenizer to use to encode
    :param article_list: list of str
    :param evidence_list: list of str
    :param max_len: int; max length of encoding
    :return: encoded tensors
    """
    input_ids = []
    attention_masks = []
    # ensure length of article_list and evidence_list are equal
    zip_list = zip(cycle(article_list), evidence_list) if len(article_list) < len(evidence_list) else zip(article_list, evidence_list)
    for article, evidence in zip_list:
        encoded_dict = tokenizer.encode_plus(
            text=evidence,               # Sentence to encode.
            text_pair=article,
            add_special_tokens=True,     # Add '[CLS]' and '[SEP]'
            max_length=max_len,          # Pad & truncate all sentences.
            pad_to_max_length=True,
            return_attention_mask=True,  # Construct attn. masks.
            return_tensors='pt',         # Return pytorch tensors.
        )
        input_ids.append(encoded_dict['input_ids'])
        attention_masks.append(encoded_dict['attention_mask'])
    # Convert the lists into tensors.
    input_ids = torch.cat(input_ids, dim=0)
    attention_masks = torch.cat(attention_masks, dim=0)
    return input_ids, attention_masks
def predict_instance(input_ids, attention_masks):
    with torch.no_grad():
        outputs = model(input_ids, token_type_ids=None, attention_mask=attention_masks)
        logits = outputs[0]
        logits = logits.detach().cpu().numpy()
        softmax_values = softmax([logits[0]]).flatten()
    return softmax_values
article = "Manchester United is one of the most successful English football clubs in history. The club has been performing poorly since their star manager Alex Ferguson retired. However recently appointed manager, Ole is showing signs of promise in the rebuilding process."
evidence = "Ole Gunner Solskjaer is doing an excellent job with Manchester United"
input_ids, masks = encode_sentences(tokenizer, [article], [evidence])
# applying integrated gradients to the prediction function and input data point
target_class_index = 0  # class index to explain; pick the label of interest
ig = IntegratedGradients(predict_instance)
attributions, approximation_error = ig.attribute((input_ids, masks), target=target_class_index, return_convergence_delta=True)
# Each returned attribution has the same shape and dimensionality as its input.
assert attributions[0].shape == input_ids.shape
This issue seems very similar to this one: https://github.com/huggingface/transformers/issues/2952
I have tried casting the input_ids and masks to long tensors according to the suggestion there, but it did not help.
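For reference, the cast was roughly the following (both tensors converted to torch.long before calling attribute):

# explicit cast to int64 (Long) before attribution -- this alone did not fix the error
input_ids = input_ids.long()
masks = masks.long()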
Hello @vivekmig, thank you so much for pointing me towards my mistake. I have followed the IMDB example; however, I am now getting the following error:
Here is my updated code:
Hi @sajjadriaj, it seems that the issue here is that you are trying to compute attributions with respect to token indices, but we cannot actually compute gradients with respect to these indices, only to the corresponding embeddings. More information regarding this can be found in this FAQ answer.
To compute Integrated Gradients for tokens, you would need to attribute with respect to word embeddings rather than input indices, which can be done either by overriding the embedding layer or using a LayerAttribution method. Examples of these can be found in our BERT or IMDB tutorials here.
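For a concrete starting point, here is a minimal sketch of the LayerAttribution route using LayerIntegratedGradients, assuming the RoBERTa model and the input_ids/masks produced by the snippet above; target_class_index stands for whichever class you want to explain, and note that the forward function must not run under torch.no_grad(), since gradients are required:

from captum.attr import LayerIntegratedGradients

def forward_func(input_ids, attention_mask):
    # return the logits; gradients must flow, so no torch.no_grad() here
    return model(input_ids, attention_mask=attention_mask)[0]

# attribute with respect to the word-embedding layer instead of the raw token indices
lig = LayerIntegratedGradients(forward_func, model.roberta.embeddings.word_embeddings)

attributions, delta = lig.attribute(
    inputs=input_ids,
    additional_forward_args=(masks,),
    target=target_class_index,
    return_convergence_delta=True,
)

# attributions has shape (batch, seq_len, embedding_dim); sum over the embedding
# dimension to get one score per token
token_attributions = attributions.sum(dim=-1)

The linked tutorials additionally show how to construct baseline (reference) token ids and visualize the resulting per-token scores.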