
ONNX error while trying to add_graph

See original GitHub issue

Hi,

I’m experiencing an issue (which probably comes from my inexperience with this kind of tool) with the add_graph method.

I have a GAN composed of two models, a Generator and a Discriminator. I’d like to draw both, but I’d be more than happy even if I could graph just one of them.

The generator is based on an encoder-decoder framework, involving LSTMs and a pooling mechanism to extrapolate information in the latent space. The structure of the generator looks something like this:

TrajectoryGenerator(
  (encoder): Encoder(
    (encoder): LSTM(64, 32)
    (spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
  )
  (decoder): Decoder(
    (decoder): LSTM(64, 32)
    (spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
    (hidden2pos): Linear(in_features=32, out_features=2, bias=True)
  )
  (pool_net): PoolHiddenNet(
    (spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
    (mlp_pre_pool): Sequential(
      (0): Linear(in_features=96, out_features=512, bias=True)
      (1): ReLU()
      (2): Linear(in_features=512, out_features=32, bias=True)
      (3): ReLU()
    )
  )
  (mlp_decoder_context): Sequential(
    (0): Linear(in_features=64, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=32, bias=True)
    (3): ReLU()
  )
)

The forward method is declared as:

def forward(self, obs_traj, obs_traj_rel, seq_start_end, user_noise=None):
        """
        Inputs:
        - obs_traj: Tensor of shape (obs_len, batch, 2)
        - obs_traj_rel: Tensor of shape (obs_len, batch, 2)
        - seq_start_end: A list of tuples which delimit sequences within batch.
        - user_noise: Generally used for inference when you want to see
        relation between different types of noise and outputs.
        Output:
        - pred_traj_rel: Tensor of shape (self.pred_len, batch, 2)
        """
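
For reference, dummy inputs matching those docstring shapes might look like this (the sizes below are hypothetical; the real `obs_len` and `batch` come from the dataset):

```python
import torch

# Hypothetical sizes -- the real values depend on the data loader.
obs_len, batch = 8, 64

obs_traj = torch.randn(obs_len, batch, 2)      # (obs_len, batch, 2)
obs_traj_rel = torch.randn(obs_len, batch, 2)  # (obs_len, batch, 2)
# seq_start_end: (start, end) pairs delimiting sequences within the batch.
seq_start_end = torch.tensor([[0, 32], [32, 64]])
```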

Coming to the problem: when I run my train.py I build my generator = TrajectoryGenerator(...) object and, since I want to draw the graph only once (before the actual training procedure starts), I generate some dummy inputs to feed to the generator:

from tensorboardX import SummaryWriter

generator = TrajectoryGenerator(...)
obs_traj = ...
obs_traj_rel = ...
seq_start_end = ...
with SummaryWriter(comment='Generator') as w:
    w.add_graph(generator, (obs_traj, obs_traj_rel, seq_start_end), verbose=True)

This leads to:

Traceback (most recent call last):
  File "scripts/train.py", line 620, in <module>
    main(args)
  File "scripts/train.py", line 196, in main
    w.add_graph(generator, (obs_traj, obs_traj_rel, seq_start_end), verbose=True)
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/tensorboardX/writer.py", line 566, in add_graph
    self.file_writer.add_graph(graph(model, input_to_model, verbose))
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 235, in graph
    _optimize_trace(trace, torch.onnx.utils.OperatorExportTypes.ONNX)
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 175, in _optimize_trace
    trace.set_graph(_optimize_graph(trace.graph(), operator_export_type))
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 206, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/torch/onnx/__init__.py", line 52, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/torch/onnx/utils.py", line 504, in _run_symbolic_function
    return fn(g, *inputs, **attrs)
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/torch/onnx/symbolic.py", line 1351, in randn
    return g.op('RandomNormal', shape_i=shape)
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/torch/onnx/utils.py", line 452, in _graph_op
    n = g.insertNode(_newNode(g, opname, outputs, *args, **kwargs))
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/torch/onnx/utils.py", line 405, in _newNode
    _add_attribute(n, k, v, aten=aten)
  File "/equilibrium/***/anaconda3/envs/progetto/lib/python3.6/site-packages/torch/onnx/utils.py", line 383, in _add_attribute
    return getattr(node, kind + "_")(name, value)
TypeError: i_(): incompatible function arguments. The following argument types are supported:
    1. (self: torch._C.Node, arg0: str, arg1: int) -> torch._C.Node

Invoked with: %2006 : Tensor = onnx::RandomNormal(), scope: TrajectoryGenerator
, 'shape', 2002 defined in (%2002 : int[] = prim::ListConstruct(%2000, %2001), scope: TrajectoryGenerator
) (occurred when translating randn)

I’m not really sure whether this issue is related to tensorboardX or not; can you give me some feedback? Let me point out that calling generator(obs_pred, obs_pred_rel, seq_start_end) directly does not throw any error.
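
The traceback points at the ONNX symbolic for `randn`: the noise shape reaches it as a dynamic `prim::ListConstruct` value rather than constant integers, which the `shape_i` attribute of `RandomNormal` cannot accept. A minimal sketch of the pattern that triggers this under tracing (`WithNoise` is a hypothetical stand-in, not the author’s code):

```python
import torch
import torch.nn as nn

class WithNoise(nn.Module):
    """Minimal stand-in for a generator that samples noise inside forward()."""
    def forward(self, x):
        # Under ONNX tracing, this shape is captured as a dynamic
        # prim::ListConstruct -- exactly the value the traceback shows being
        # rejected, since RandomNormal's shape attribute needs constant ints.
        noise = torch.randn(x.size(0), x.size(1))
        return x + noise

m = WithNoise()
out = m(torch.zeros(3, 5))  # eager execution works fine, as the author notes
```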

Versions:

  • pytorch: 1.0.1
  • torchvision: 0.2.1
  • python: 3.6.8
  • tensorflow: 1.12.0
  • tensorboard: 1.12.2
  • tensorboardX: 1.6

Thank you for your efforts!

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

6 reactions
qmpzzpmq commented, Feb 28, 2019

Same problem. I guess it comes from ONNX, since the same failure shows up when going through the ONNX API directly, and I couldn’t even save an ONNX model.

4 reactions
JurijsNazarovs commented, Mar 10, 2019

For those who will look here later: onnx 1.4.1 still does not support RandomNormal().
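
For anyone hitting this today, a common workaround (a sketch, not an official fix) is to sample the noise outside the traced forward() and pass it in as an explicit input, so the tracer never has to emit a RandomNormal node with a dynamic shape. `NoisyHead` below is a hypothetical toy module, not the author’s generator:

```python
import torch
import torch.nn as nn

class NoisyHead(nn.Module):
    """Toy module standing in for the noise-injection step of a generator."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)

    # Instead of calling torch.randn(...) inside forward (which the ONNX
    # tracer turns into an unsupported dynamic-shape RandomNormal op),
    # accept the noise tensor as an argument.
    def forward(self, x, noise):
        return self.fc(x + noise)

model = NoisyHead(32)
x = torch.randn(4, 32)
noise = torch.randn(4, 32)  # generated *outside* the traced graph
out = model(x, noise)
```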
