LayerDeepLift fails when used on a MaxPooling layer?
I am trying to use LayerDeepLift on multiple layers of a VGG16 model from torchvision.models. It works for all layers except the MaxPool2d layers.
The following (layer 23 is a MaxPool2d layer):
import torchvision
import captum.attr

model = torchvision.models.vgg16(pretrained=True)
# torch_im is a preprocessed 3x224x224 input image tensor
u = captum.attr.LayerDeepLift(
    model, list(model.features.children())[23]
).attribute(torch_im[None, ...], target=156)[0]
Raises the following:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-66-668b5d33db17> in <module>
----> 1 u = captum.attr.LayerDeepLift(model, list(model.features.children())[23]).attribute(torch_im[None, ...], target=156)[0]
i:\languages\python\envs\deel-torch\lib\site-packages\captum\attr\_core\layer\layer_deep_lift.py in attribute(self, inputs, baselines, target, additional_forward_args, return_convergence_delta, attribute_to_layer_input, custom_attribution_func)
306 inputs,
307 attribute_to_layer_input=attribute_to_layer_input,
--> 308 output_fn=lambda out: chunk_output_fn(out),
309 )
310
i:\languages\python\envs\deel-torch\lib\site-packages\captum\attr\_utils\gradient.py in compute_layer_gradients_and_eval(forward_fn, layer, inputs, target_ind, additional_forward_args, gradient_neuron_index, device_ids, attribute_to_layer_input, output_fn)
517 for layer_tensor in saved_layer[device_id]
518 )
--> 519 saved_grads = torch.autograd.grad(torch.unbind(output), grad_inputs)
520 saved_grads = [
521 saved_grads[i : i + num_tensors]
i:\languages\python\envs\deel-torch\lib\site-packages\torch\autograd\__init__.py in grad(outputs, inputs, grad_outputs, retain_graph, create_graph, only_inputs, allow_unused)
155 return Variable._execution_engine.run_backward(
156 outputs, grad_outputs, retain_graph, create_graph,
--> 157 inputs, allow_unused)
158
159
i:\languages\python\envs\deel-torch\lib\site-packages\captum\attr\_core\deep_lift.py in _backward_hook(self, module, grad_input, grad_output, eps)
461 multipliers = tuple(
462 SUPPORTED_NON_LINEAR[type(module)](
--> 463 module, module.input, module.output, grad_input, grad_output, eps=eps
464 )
465 )
i:\languages\python\envs\deel-torch\lib\site-packages\captum\attr\_core\deep_lift.py in maxpool2d(module, inputs, outputs, grad_input, grad_output, eps)
920 grad_input,
921 grad_output,
--> 922 eps=eps,
923 )
924
i:\languages\python\envs\deel-torch\lib\site-packages\captum\attr\_core\deep_lift.py in maxpool(module, pool_func, unpool_func, inputs, outputs, grad_input, grad_output, eps)
1002
1003 new_grad_inp = torch.where(
-> 1004 abs(delta_in) < eps, grad_input[0], unpool_grad_out_delta / delta_in
1005 )
1006 # If the module is invalid, save the newly computed gradients
RuntimeError: The size of tensor a (28) must match the size of tensor b (14) at non-singleton dimension 3
It works on all layers except the MaxPool2d layers of vgg16.features (it does work with the average pooling layer). I am not sure whether this is a restriction of DeepLift or a bug in the implementation.
Also, when the error occurs, the model seems to be left in an inconsistent state: reusing it afterwards raises IndexError: tuple index out of range, even with a brand-new captum.attr.LayerDeepLift instance.
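As for the stale state after the error, one guess (an assumption, not confirmed by the report) is that the hooks registered by the failed attribution run are never removed. A defensive cleanup sketch using the module's private hook dictionaries follows; re-creating the model is the safer alternative.

# Assumption: the follow-up IndexError comes from hooks left attached by the
# failed run. Clearing them (or simply re-instantiating the model) gives a
# clean starting point for the next attempt.
for m in model.modules():
    m._forward_pre_hooks.clear()
    m._forward_hooks.clear()
    m._backward_hooks.clear()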
@Holt59, this PR #390 will fix the problem with MaxPool. To give more context, this problem happened because in the forward_hook we return a cloned output tensor, and that made the MaxPool modules "complex" modules. Since there is a bug in PyTorch related to complex modules and backward_hook (namely, the returned input gradients represent only a subset of the inputs), DeepLift wasn't able to compute the multipliers correctly. More details about the issue can be found here: https://pytorch.org/docs/stable/nn.html#torch.nn.Module.register_backward_hook
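To illustrate that caveat, here is a minimal sketch (an assumption about the mechanism, not code taken from Captum): cloning the output in a forward hook adds an extra operation to the module's forward pass, and on PyTorch versions affected by the legacy register_backward_hook limitation, the grad_input the backward hook receives can be output-sized (14x14) instead of input-sized (28x28), which is exactly the mismatch in the traceback above.

import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)

def forward_hook(module, inp, out):
    # Returning a clone adds a second operation to the module's forward pass,
    # which is what turns it into a "complex" module for the hook machinery.
    return out.clone()

def backward_hook(module, grad_input, grad_output):
    print("grad_input shapes: ", [tuple(g.shape) for g in grad_input if g is not None])
    print("grad_output shapes:", [tuple(g.shape) for g in grad_output if g is not None])

pool.register_forward_hook(forward_hook)
pool.register_backward_hook(backward_hook)  # legacy hook with the documented caveat

x = torch.randn(1, 512, 28, 28, requires_grad=True)
pool(x).sum().backward()
# On affected PyTorch versions the printed grad_input can be (1, 512, 14, 14)
# rather than (1, 512, 28, 28), i.e. it no longer matches the module's input.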
Another point that I wanted to bring up: in VGG the modules might get reused (you might want to check that). We want to make sure that this isn't happening for the layer algorithms and DeepLift. If the activations do get reused, you can simply redefine the architecture (that's easy to do). More info about it can be found here:
https://github.com/pytorch/captum/issues/378#issuecomment-633309752
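A quick way to check for reused module instances, as a rough sketch (assuming "reused" means the same module object appearing more than once in the features container; it will not catch a module being called twice inside a custom forward):

# nn.Sequential is iterable, so this lists the identity of every entry in
# vgg16.features and checks whether any instance appears more than once.
module_ids = [id(m) for m in model.features]
reuse_detected = len(module_ids) != len(set(module_ids))
print("module instance reused in features:", reuse_detected)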
Actually, what you were doing would be equivalent to the following, as a workaround:
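The exact snippet from the original comment is not reproduced here; the sketch below is only one plausible equivalent, assuming the intent was to target the same activation through the layer that follows the MaxPool. It relies on attribute_to_layer_input, which appears in the attribute signature in the traceback above: the input of features[24] is the output of the MaxPool at features[23], so the MaxPool's backward hook is never involved.

import torchvision
import captum.attr

model = torchvision.models.vgg16(pretrained=True).eval()

# Attribute to the *input* of the layer after the MaxPool instead of the
# *output* of the MaxPool itself; both refer to the same activation tensor.
layer_dl = captum.attr.LayerDeepLift(model, model.features[24])
u = layer_dl.attribute(
    torch_im[None, ...],   # torch_im: the preprocessed input image from the report
    target=156,
    attribute_to_layer_input=True,
)[0]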