
Pruning yolox model with [aten::arange/aten::meshgrid/aten::stack] is not Supported!

See original GitHub issue

When I prune the YOLOX model I get an error.

I need to prune a YOLOX model that uses torch.arange / torch.meshgrid / torch.stack in yolox_head.
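For context, the grid construction in yolox_head looks roughly like the sketch below (a simplified version of my own, not the exact YOLOX source); these are the calls that end up as aten::arange, aten::meshgrid and aten::stack nodes in the traced graph:

import torch

def make_grid(hsize, wsize):
    # simplified sketch of the grid construction in yolox_head (names are illustrative)
    yv, xv = torch.meshgrid(torch.arange(hsize), torch.arange(wsize))  # aten::arange (x2), aten::meshgrid
    grid = torch.stack((xv, yv), dim=2).view(1, -1, 2)                 # aten::stack
    return grid

grid = make_grid(80, 80)  # e.g. the stride-8 feature map of a 640x640 input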

The following error is triggered during the model speed-up phase:

ERROR (nni.compression.pytorch.speedup.jit_translate/MainThread) aten::arange is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for head.aten::arange.288
ERROR (nni.compression.pytorch.speedup.jit_translate/MainThread) aten::arange is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for backbone.C3_n3.conv3.conv
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for head.aten::meshgrid.289
ERROR (nni.compression.pytorch.speedup.jit_translate/MainThread) aten::meshgrid is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for backbone.C3_n3.conv3.bn
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for head.prim::ListUnpack.290
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for backbone.C3_n3.conv3.act
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for head.aten::stack.291
ERROR (nni.compression.pytorch.speedup.jit_translate/MainThread) aten::stack is not Supported! Please report an issue at https://github.com/microsoft/nni. Thanks~
INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for backbone.bu_conv1.conv

nni environment:

  • nni version: 2.7
  • nni mode(local|pai|remote): local
  • OS: Ubuntu 18.04
  • python version: 3.8
  • is conda or virtualenv used?: YES
  • is running in docker?: NO

I tried to modify jit_translate.py by defining a function:

def arange_python(node, speedup):
    class arangeModule(torch.nn.Module):
        def forward(self, x):
            return torch.arange(int(x))
    return arangeModule

and adding the mapping 'aten::arange': arange_python in trans_from_jit_to_python.
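The registration looks roughly like the sketch below (the other entries are whatever already exists in nni/compression/pytorch/speedup/jit_translate.py; only the 'aten::arange' line is my addition):

trans_from_jit_to_python = {
    # ... existing aten::* translation functions already defined in jit_translate.py ...
    'aten::arange': arange_python,
}

With that mapping in place, the speed-up still fails: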

[2022-05-25 10:14:08] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for head.aten::arange.287
2022-05-25 10:14:08 | ERROR    | yolox.core.launch:98 - An error has been caught in function 'launch', process 'MainProcess' (5735), thread 'MainThread' (140500556732224):
Traceback (most recent call last):

  File "/home/yq/pycharm_project/YOLOX-main/tools/prune.py", line 315, in <module>
    launch(
    └ <function launch at 0x7fc7fcfa4280>

> File "/home/yq/pycharm_project/YOLOX-main/yolox/core/launch.py", line 98, in launch
    main_func(*args)
    │          └ (╒═══════════════════╤═══════════════════════════════════════════════════════════════════════════════════════════════════════...
    └ <function main at 0x7fc7efe648b0>

  File "/home/yq/pycharm_project/YOLOX-main/tools/prune.py", line 154, in main
    model = prune_main(model, "cuda:0")
            │          └ YOLOX(
            │              (backbone): YOLOPAFPN(
            │                (backbone): CSPDarknet(
            │                  (stem): Focus(
            │                    (conv): BaseConv(
            │                      (conv): ...
            └ <function prune_main at 0x7fc7ef859e50>

  File "/home/yq/pycharm_project/YOLOX-main/tools/prune.py", line 298, in prune_main
    ModelSpeedup(model, torch.rand(1, 3, 640, 640), masks).speedup_model()
    │            │      │     │                     └ {'backbone.backbone.stem.conv.conv': {'weight': tensor([[[[1., 1., 1.],
    │            │      │     │                                 [1., 1., 1.],
    │            │      │     │                                 [1., 1., 1.]],
    │            │      │     │
    │            │      │     │                          ...
    │            │      │     └ <built-in method rand of type object at 0x7fc8accc4520>
    │            │      └ <module 'torch' from '/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/torch/__init__.py'>
    │            └ YOLOX(
    │                (backbone): YOLOPAFPN(
    │                  (backbone): CSPDarknet(
    │                    (stem): Focus(
    │                      (conv): BaseConv(
    │                        (conv): ...
    └ <class 'nni.compression.pytorch.speedup.compressor.ModelSpeedup'>

  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 512, in speedup_model
    self.infer_modules_masks()
    │    └ <function ModelSpeedup.infer_modules_masks at 0x7fc7e0a6c700>
    └ <nni.compression.pytorch.speedup.compressor.ModelSpeedup object at 0x7fc7c71e93d0>
  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 355, in infer_modules_masks
    self.update_direct_sparsity(curnode)
    │    │                      └ name: head.aten::arange.287, type: func, op_type: aten::arange, sub_nodes: ['__module.head_aten::arange'], inputs: ['6522'], ...
    │    └ <function ModelSpeedup.update_direct_sparsity at 0x7fc7e0a6c550>
    └ <nni.compression.pytorch.speedup.compressor.ModelSpeedup object at 0x7fc7c71e93d0>
  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 216, in update_direct_sparsity
    _auto_infer = AutoMaskInference(
                  └ <class 'nni.compression.pytorch.speedup.infer_mask.AutoMaskInference'>
  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/infer_mask.py", line 83, in __init__
    self.output = self.module(*dummy_input)
    │             │    │       └ [80]
    │             │    └ <class 'nni.compression.pytorch.speedup.jit_translate.arange_python.<locals>.arangeModule'>
    │             └ <nni.compression.pytorch.speedup.infer_mask.AutoMaskInference object at 0x7fc7c6decca0>
    └ <nni.compression.pytorch.speedup.infer_mask.AutoMaskInference object at 0x7fc7c6decca0>

TypeError: __init__() takes 1 positional argument but 2 were given

I don’t know how to fix it; I hope to get your help 🍺

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
qyang1996 commented, May 27, 2022

> Or you can leave the model here, and I’ll support the corresponding operators for you, but I still encourage you to have a try and contribute back to NNI, thanks.

Thank you, I want to try it myself first. ✊

1 reaction
qyang1996 commented, May 27, 2022

I’m very sorry; I found out that this is a bug I created while modifying the NNI source code. I re-installed NNI and only modified jit_translate.py to define arange_python:

def arange_python(node, speedup):
    class arangeModule(torch.nn.Module):
        def forward(self, x):
            print('current input is :', x)
            import pdb;
            pdb.set_trace()
            return torch.arange(int(x))
    return arangeModule()

and added 'aten::arange': arange_python to trans_from_jit_to_python, as before.

The actual debug output is as follows:

[2022-05-27 16:37:03] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for head.aten::arange.287
current input is : tensor([80])
> /home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/jit_translate.py(121)forward()
-> return torch.arange(int(x))
(Pdb) x
tensor([80])
(Pdb) continue
current input is : tensor([0])
> /home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/jit_translate.py(121)forward()
-> return torch.arange(int(x))
(Pdb) x
tensor([0])
(Pdb) continue
2022-05-27 16:37:46 | ERROR    | yolox.core.launch:98 - An error has been caught in function 'launch', process 'MainProcess' (13614), thread 'MainThread' (139982614955840):
Traceback (most recent call last):

  File "/home/yq/pycharm_project/YOLOX-main/tools/prune.py", line 249, in <module>
    launch(
    └ <function launch at 0x7f4f653dd160>

> File "/home/yq/pycharm_project/YOLOX-main/yolox/core/launch.py", line 98, in launch
    main_func(*args)
    │          └ (╒═══════════════════╤═══════════════════════════════════════════════════════════════════════════════════════════════════════...
    └ <function main at 0x7f4f5829d790>

  File "/home/yq/pycharm_project/YOLOX-main/tools/prune.py", line 157, in main
    model = prune_main(model, "cuda:0")
            │          └ YOLOX(
            │              (backbone): YOLOPAFPN(
            │                (backbone): CSPDarknet(
            │                  (stem): Focus(
            │                    (conv): BaseConv(
            │                      (conv): ...
            └ <function prune_main at 0x7f4f57c94d30>

  File "/home/yq/pycharm_project/YOLOX-main/tools/prune.py", line 232, in prune_main
    ModelSpeedup(model, torch.rand(1, 3, 640, 640), masks).speedup_model()
    │            │      │     │                     └ {'backbone.backbone.stem.conv.conv': {'weight': tensor([[[[0., 0., 0.],
    │            │      │     │                                 [0., 0., 0.],
    │            │      │     │                                 [0., 0., 0.]],
    │            │      │     │
    │            │      │     │                          ...
    │            │      │     └ <built-in method rand of type object at 0x7f50150ff520>
    │            │      └ <module 'torch' from '/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/torch/__init__.py'>
    │            └ YOLOX(
    │                (backbone): YOLOPAFPN(
    │                  (backbone): CSPDarknet(
    │                    (stem): Focus(
    │                      (conv): BaseConv(
    │                        (conv): ...
    └ <class 'nni.compression.pytorch.speedup.compressor.ModelSpeedup'>

  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 512, in speedup_model
    self.infer_modules_masks()
    │    └ <function ModelSpeedup.infer_modules_masks at 0x7f4f34ea3670>
    └ <nni.compression.pytorch.speedup.compressor.ModelSpeedup object at 0x7f4f2f685040>
  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 355, in infer_modules_masks
    self.update_direct_sparsity(curnode)
    │    │                      └ name: head.aten::arange.287, type: func, op_type: aten::arange, sub_nodes: ['__module.head_aten::arange'], inputs: ['6522'], ...
    │    └ <function ModelSpeedup.update_direct_sparsity at 0x7f4f34ea34c0>
    └ <nni.compression.pytorch.speedup.compressor.ModelSpeedup object at 0x7f4f2f685040>
  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 229, in update_direct_sparsity
    _auto_infer.update_direct_sparsity()
    │           └ <function AutoMaskInference.update_direct_sparsity at 0x7f4f34e8e9d0>
    └ <nni.compression.pytorch.speedup.infer_mask.AutoMaskInference object at 0x7f4f3ef476d0>
  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/infer_mask.py", line 331, in update_direct_sparsity
    out_mask, constant = self.isconstants(out.clone().detach())
    │                    │    │           │   └ <method 'clone' of 'torch._C._TensorBase' objects>
    │                    │    │           └ tensor([], dtype=torch.int64)
    │                    │    └ <function AutoMaskInference.isconstants at 0x7f4f34e8e8b0>
    │                    └ <nni.compression.pytorch.speedup.infer_mask.AutoMaskInference object at 0x7f4f3ef476d0>
    └ None
  File "/home/yq/anaconda3/envs/yolox-py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/infer_mask.py", line 227, in isconstants
    same = tout[:] == tout[0]
           │          └ tensor([], dtype=torch.int64)
           └ tensor([], dtype=torch.int64)

IndexError: index 0 is out of bounds for dimension 0 with size 0

Process finished with exit code 0

It seems that the module returned by arange_python is called twice in speedup/compressor.py.

The AutoMaskInference class is instantiated for the first time:

_auto_infer = AutoMaskInference(func, dummy_input, in_masks, in_constants=in_constants, batch_dim=self.batch_dim)

dummy_input then becomes [tensor([0])], and the module is called a second time inside the member function update_direct_sparsity():

_auto_infer.update_direct_sparsity()

Finally, arange_python returns an empty tensor, which causes the subsequent errors.
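A minimal standalone reproduction of that failure mode outside of NNI (my own sketch of what I think is happening):

import torch

x = torch.tensor([0])        # the dummy_input seen by arange_python's forward on the second call
out = torch.arange(int(x))   # torch.arange(0) -> tensor([], dtype=torch.int64), an empty tensor
print(out)

# isconstants() in infer_mask.py then evaluates `tout[:] == tout[0]`,
# which raises on the empty tensor:
out[:] == out[0]             # IndexError: index 0 is out of bounds for dimension 0 with size 0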

I don’t know how to fix it, please help.
