Cannot handle the 'mul' operation across layers
Describe the issue:
NNI cannot handle the mul operation across layers. When we try to prune the model below, NNI can speed it up, but forward propagation then fails. The same problem also occurs with mobilenet_v3 from torchvision.
Environment:
- NNI version: 2.6
- Training service (local|remote|pai|aml|etc): local
- Server OS (for remote mode only):
- Python version: 3.7
- PyTorch/TensorFlow version: pytorch 1.10.0
- Is conda/virtualenv/venv used?: yes
- Is running in Docker?: no
Code:
import torch, torchvision
from nni.algorithms.compression.v2.pytorch.pruning import L1NormPruner
from nni.compression.pytorch.speedup import ModelSpeedup
from nni.compression.pytorch.utils import not_safe_to_prune

# Build a toy model with the same squeeze-and-excitation pattern as mobilenet_v3
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.avgpool = torch.nn.AdaptiveAvgPool2d(1)
        self.input = torch.nn.Conv2d(3, 8, 3)
        self.bn = torch.nn.BatchNorm2d(8)
        self.fc1 = torch.nn.Conv2d(8, 16, 1)
        self.fc2 = torch.nn.Conv2d(16, 8, 1)
        self.activation = torch.nn.ReLU()
        self.scale_activation = torch.nn.Hardsigmoid()
        self.out = torch.nn.Conv2d(8, 12, 1)

    def forward(self, input):
        input = self.activation(self.bn(self.input(input)))
        scale = self.avgpool(input)
        out1 = self.activation(self.fc1(scale))
        out1 = self.scale_activation(self.fc2(out1))
        # channel-wise scaling: the 'mul' that crosses layers
        return self.out(out1 * input)

model = Net().to(device)
model.eval()
im = torch.ones(1, 3, 512, 512).to(device)

# Optional ONNX export of the original model (kept commented out in the repro):
# with torch.no_grad():
#     input_name = ['input']
#     output_name = ['output']
#     onnxname = 'Net_original.onnx'
#     torch.onnx.export(model, im, onnxname, input_names=input_name, output_names=output_name,
#                       opset_version=11, training=False, verbose=False, do_constant_folding=False)
#     print(f'successfully exported onnx {onnxname}')
#     exit()

y = model(im)
torch.jit.trace(model, im, strict=False)

# Collect layers that are not safe to prune and exclude them from the config
not_safe = not_safe_to_prune(model, im)
print('\n' + '=' * 50 + 'not_safe' + '=' * 50, not_safe)

cfg_list = []
for name, module in model.named_modules():
    if name in not_safe:
        continue
    if isinstance(module, torch.nn.Conv2d):
        cfg_list.append({'op_types': ['Conv2d'], 'sparsity': 0.3, 'op_names': [name]})
print('cfg_list', cfg_list)

# Prune, then speed up the model with the generated masks
pruner = L1NormPruner(model, cfg_list)
_, masks = pruner.compress()
pruner.show_pruned_weights()
pruner._unwrap_model()
ModelSpeedup(model, dummy_input=im, masks_file=masks).speedup_model()
print(model)
torch.jit.trace(model, im, strict=False)
Log message:
==================================================not_safe================================================== []
cfg_list [{'op_types': ['Conv2d'], 'sparsity': 0.3, 'op_names': ['input']}, {'op_types': ['Conv2d'], 'sparsity': 0.3, 'op_names': ['fc1']}, {'op_types': ['Conv2d'], 'sparsity': 0.3, 'op_names': ['fc2']}, {'op_types': ['Conv2d'], 'sparsity': 0.3, 'op_names': ['out']}]
[2022-02-10 17:02:20] INFO (nni.algorithms.compression.v2.pytorch.base.pruner/MainThread) simulated prune input remain/total: 6/8
[2022-02-10 17:02:20] INFO (nni.algorithms.compression.v2.pytorch.base.pruner/MainThread) simulated prune fc1 remain/total: 12/16
[2022-02-10 17:02:20] INFO (nni.algorithms.compression.v2.pytorch.base.pruner/MainThread) simulated prune fc2 remain/total: 6/8
[2022-02-10 17:02:20] INFO (nni.algorithms.compression.v2.pytorch.base.pruner/MainThread) simulated prune out remain/total: 9/12
[2022-02-10 17:02:24] INFO (nni.compression.pytorch.speedup.compressor/MainThread) start to speed up the model
[2022-02-10 17:02:26] INFO (FixMaskConflict/MainThread) {'input': 1, 'fc1': 1, 'fc2': 1, 'out': 1}
[2022-02-10 17:02:26] INFO (FixMaskConflict/MainThread) dim0 sparsity: 0.250000
[2022-02-10 17:02:26] INFO (FixMaskConflict/MainThread) dim1 sparsity: 0.000000
[2022-02-10 17:02:26] INFO (FixMaskConflict/MainThread) Dectected conv prune dim" 0
[2022-02-10 17:02:26] INFO (nni.compression.pytorch.speedup.compressor/MainThread) infer module masks...
[2022-02-10 17:02:26] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for input
[2022-02-10 17:02:26] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for bn
[2022-02-10 17:02:27] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for activation
[2022-02-10 17:02:27] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for avgpool
[2022-02-10 17:02:27] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for fc1
[2022-02-10 17:02:27] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for activation
[2022-02-10 17:02:27] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for fc2
[2022-02-10 17:02:27] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for scale_activation
[2022-02-10 17:02:27] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for .aten::mul.9
[2022-02-10 17:02:28] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for out
[2022-02-10 17:02:29] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the out
/Users/maxin/anaconda3/envs/torch/lib/python3.7/site-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at aten/src/ATen/core/TensorBody.h:417.)
  return self._grad
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the .aten::mul.9
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the scale_activation
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the fc2
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the activation.1
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the fc1
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the avgpool
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the activation
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the bn
[2022-02-10 17:02:30] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the input
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) resolve the mask conflict
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace compressed modules...
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: input, op_type: Conv2d)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: bn, op_type: BatchNorm2d)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compress_modules/MainThread) replace batchnorm2d with num_features: 6
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: activation, op_type: ReLU)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: avgpool, op_type: AdaptiveAvgPool2d)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: fc1, op_type: Conv2d)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: activation, op_type: ReLU)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: fc2, op_type: Conv2d)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: scale_activation, op_type: Hardsigmoid)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Warning: cannot replace (name: .aten::mul.9, op_type: aten::mul) which is func type
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) replace module (name: out, op_type: Conv2d)
[2022-02-10 17:02:31] INFO (nni.compression.pytorch.speedup.compressor/MainThread) speedup done
Net(
  (avgpool): AdaptiveAvgPool2d(output_size=1)
  (input): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
  (bn): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc1): Conv2d(6, 12, kernel_size=(1, 1), stride=(1, 1))
  (fc2): Conv2d(12, 5, kernel_size=(1, 1), stride=(1, 1))
  (activation): ReLU()
  (scale_activation): Hardsigmoid()
  (out): Conv2d(6, 9, kernel_size=(1, 1), stride=(1, 1))
)
Traceback (most recent call last):
  File "master/2.py", line 68, in <module>
    torch.jit.trace(model, im, strict=False)
  File "anaconda3/envs/torch/lib/python3.7/site-packages/torch/jit/_trace.py", line 750, in trace
    _module_class,
  File "anaconda3/envs/torch/lib/python3.7/site-packages/torch/jit/_trace.py", line 965, in trace_module
    argument_names,
  File "anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "anaconda3/envs/torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1090, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "master/2.py", line 27, in forward
    return self.out(out1 * input)
RuntimeError: The size of tensor a (5) must match the size of tensor b (6) at non-singleton dimension 1
Process finished with exit code 1
How can we solve this problem? Thanks!
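The traceback points at the root cause: after speedup, fc2 keeps 5 output channels while the main branch (input/bn) keeps 6, so the element-wise out1 * input no longer matches along the channel dimension. Until the mul handling is fixed in NNI (see the PR linked below for the actual resolution), one possible workaround, shown only as a hedged sketch that reuses model, im and not_safe from the repro above, is to leave the two layers whose outputs meet at the mul unpruned so both operands keep their original channel count:

# Workaround sketch (an assumption, not the official NNI fix): skip pruning for the
# layers whose outputs meet at the element-wise `mul`, so both operands keep 8 channels.
skip_around_mul = {'input', 'fc2'}  # the two branches feeding `out1 * input`

cfg_list = []
for name, module in model.named_modules():
    if name in not_safe or name in skip_around_mul:
        continue
    if isinstance(module, torch.nn.Conv2d):
        cfg_list.append({'op_types': ['Conv2d'], 'sparsity': 0.3, 'op_names': [name]})
# The rest of the pipeline (L1NormPruner, ModelSpeedup) stays the same as above.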
Top GitHub Comments
hi @zheng-ningxin, it works well! Thank you for your patient guidance ~ 😊😄 So this confidence value is actually used as a batch-size parameter in the code. From your explanation, I understand that the pruning process modifies the convolution kernel shapes via shape inference (i.e., modifies the mask) for each batch of input until all batches are inferred correctly, then fixes the kernel shapes, after which the real hard pruning is performed. Is this understanding correct?
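For readers following along: the "confidence" discussed above is an argument of ModelSpeedup that is used as the batch size of the dummy input during mask and shape inference. A minimal sketch, assuming the NNI 2.x signature where confidence is an optional keyword argument (default 8), reusing model, im and masks from the repro:

# Sketch (assumes NNI 2.x ModelSpeedup exposes `confidence`): a larger value runs the
# mask/shape inference with a larger dummy batch before the real hard pruning is applied.
ModelSpeedup(model, dummy_input=im, masks_file=masks, confidence=8).speedup_model()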
https://github.com/microsoft/nni/pull/4594