
Question: Integrated Gradient w/ Embedded Categorical Data


Hi Everyone,

Question:

How can I apply Integrated Gradients to a dataset with numerical and embedded categorical data?

I am somewhat of a beginner with PyTorch, and the available resources just aren’t clicking with my use case. The ultimate goal is to plot the feature importance of a model, but I am stuck on calculating the attributions. Any help or guidance would be much appreciated.

What I’ve reviewed:

(These resources all use very different data structures (images/sentences) and are hard for a beginner to translate to a simpler tabular numerical/categorical dataset.)

My Problem:

Tutorial/Full Code Dataset

Model:

Model(
  (all_embeddings): ModuleList(
    (0): Embedding(3, 2)
    (1): Embedding(2, 1)
    (2): Embedding(2, 1)
    (3): Embedding(2, 1)
  )
  (embedding_dropout): Dropout(p=0.4, inplace=False)
  (batch_norm_num): BatchNorm1d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layers): Sequential(
    (0): Linear(in_features=11, out_features=200, bias=True)
    (1): ReLU(inplace=True)
    (2): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.4, inplace=False)
    (4): Linear(in_features=200, out_features=100, bias=True)
    (5): ReLU(inplace=True)
    (6): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.4, inplace=False)
    (8): Linear(in_features=100, out_features=50, bias=True)
    (9): ReLU(inplace=True)
    (10): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (11): Dropout(p=0.4, inplace=False)
    (12): Linear(in_features=50, out_features=2, bias=True)
  )
)

Categorical Data Example:

tensor([[0, 0, 1, 1],
        [2, 0, 0, 1],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
        [2, 0, 1, 1]])

Numerical Data Example:

tensor([[6.1900e+02, 4.2000e+01, 2.0000e+00, 0.0000e+00, 1.0000e+00, 1.0135e+05],
        [6.0800e+02, 4.1000e+01, 1.0000e+00, 8.3808e+04, 1.0000e+00, 1.1254e+05],
        [5.0200e+02, 4.2000e+01, 8.0000e+00, 1.5966e+05, 3.0000e+00, 1.1393e+05],
        [6.9900e+02, 3.9000e+01, 1.0000e+00, 0.0000e+00, 2.0000e+00, 9.3827e+04],
        [8.5000e+02, 4.3000e+01, 2.0000e+00, 1.2551e+05, 1.0000e+00, 7.9084e+04]])

Output Data Example:

tensor([1, 0, 1, 0, 0])
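As a quick sanity check on the shapes: each of the four categorical columns goes through its own embedding in the ModuleList above, giving 2 + 1 + 1 + 1 = 5 embedded values per row; concatenated with the 6 numerical columns (after BatchNorm1d(6)), that yields the 11 in_features of the first Linear layer.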

My Failing Attempt at Attribution

from captum.attr import IntegratedGradients
from captum.attr import configure_interpretable_embedding_layer

interpretable_embedding = configure_interpretable_embedding_layer(model, 'all_embeddings')

# This call raised "NotImplementedError"
cat_input_embedding = interpretable_embedding.indices_to_embeddings(categorical_train_data).unsqueeze(0)

ig = IntegratedGradients(model)

ig_attr_train = ig.attribute(
    inputs=(numerical_train_data, categorical_train_data),
    baselines=(numerical_train_data * 0.0, cat_input_embedding),
    target=train_outputs,
    n_steps=50,
)
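A likely cause of the NotImplementedError: configure_interpretable_embedding_layer needs a module that can be called like a layer, but all_embeddings is an nn.ModuleList, which is only a container and defines no forward(). A minimal sketch of that failure mode, using only the layer types from the model above:

import torch
import torch.nn as nn

# nn.ModuleList is just a container; calling it like a layer raises NotImplementedError
embs = nn.ModuleList([nn.Embedding(3, 2), nn.Embedding(2, 1)])
try:
    embs(torch.tensor([0, 1]))
except NotImplementedError:
    print("ModuleList has no forward(); wrap the embeddings in a module that does")

The accepted answer below does exactly that: it wraps the per-column embeddings in a module with a real forward().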

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

reggievick commented, Aug 19, 2020 (1 reaction)

Awesome, that is much cleaner. I was planning on refactoring once I understood it, but you’ve nailed it here. Thanks so much @NarineK!

NarineK commented, Aug 19, 2020 (1 reaction)

Yeah, I think we can clean things up and make it more modular with something like this:

import torch
import torch.nn as nn

class CombinedEmbedding(nn.Module):
    """Wraps the per-column embeddings in a single module that has a forward()."""
    def __init__(self, embedding_size):
        super().__init__()
        self.all_embeddings = nn.ModuleList([nn.Embedding(ni, nf) for ni, nf in embedding_size])

    def forward(self, x_categorical):
        # Embed each categorical column with its own embedding, then concatenate along dim 1
        embeddings = []
        for i, e in enumerate(self.all_embeddings):
            embeddings.append(e(x_categorical[:, i]))
        x = torch.cat(embeddings, 1)
        return x

class Model(nn.Module):

    def __init__(self, embedding_size, num_numerical_cols, output_size, layers, p=0.4):
        super().__init__()
        # Previously: self.all_embeddings = nn.ModuleList([...]) lived directly on the model
        self.all_embedding = CombinedEmbedding(embedding_size)

        self.embedding_dropout = nn.Dropout(p)
        self.batch_norm_num = nn.BatchNorm1d(num_numerical_cols)

        all_layers = []
        num_categorical_cols = sum(nf for ni, nf in embedding_size)
        input_size = num_categorical_cols + num_numerical_cols

        for i in layers:
            all_layers.append(nn.Linear(input_size, i))
            all_layers.append(nn.ReLU(inplace=True))
            all_layers.append(nn.BatchNorm1d(i))
            all_layers.append(nn.Dropout(p))
            input_size = i

        all_layers.append(nn.Linear(layers[-1], output_size))

        self.layers = nn.Sequential(*all_layers)

    def forward(self, x_categorical, x_numerical):
        x = self.all_embedding(x_categorical)
        x = self.embedding_dropout(x)

        x_numerical = self.batch_norm_num(x_numerical)
        x = torch.cat([x, x_numerical], 1)
        x = self.layers(x)
        return x
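For reference, instantiating the refactored model to match the printout above would look roughly like this (the variable name is a placeholder; the sizes are read off the printed model):

# Embedding sizes, numerical width, output size and hidden layers taken from the printed model
embedding_size = [(3, 2), (2, 1), (2, 1), (2, 1)]
model = Model(embedding_size, num_numerical_cols=6, output_size=2, layers=[200, 100, 50], p=0.4)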

Here is all you need for interpretability:


from captum.attr import IntegratedGradients
from captum.attr import configure_interpretable_embedding_layer, remove_interpretable_embedding_layer

# Temporarily swap 'all_embedding' for an identity wrapper so embeddings can be passed in directly
interpretable_embedding = configure_interpretable_embedding_layer(model, 'all_embedding')

# Pre-compute the embedded representation of the categorical inputs
emb = interpretable_embedding.indices_to_embeddings(categorical_test_data)

ig = IntegratedGradients(model)
ig.attribute((emb, numerical_test_data), target=0)

# Put the original embedding layer back when done
remove_interpretable_embedding_layer(model, interpretable_embedding)

I didn’t specify baselines. Feel free to specify them too.
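If you do want explicit baselines (and a single importance value per categorical column for plotting), one possible sketch, assuming the variables above and zero-tensor baselines:

import torch

# Zero tensors as baselines for both inputs (an assumed choice; any reference point works)
attr_cat, attr_num = ig.attribute(
    (emb, numerical_test_data),
    baselines=(torch.zeros_like(emb), torch.zeros_like(numerical_test_data)),
    target=0,
)

# The embedded columns are [2, 1, 1, 1] wide (one slice per categorical feature),
# so sum each slice to get one attribution per original categorical column
cat_attr_per_feature = [chunk.sum(dim=1) for chunk in attr_cat.split([2, 1, 1, 1], dim=1)]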
