RuntimeError: scatter_add_cuda_kernel does not have a deterministic implementation

I am trying to use GCN and GAT from the library and I get this error:

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch_scatter/scatter.py", line 21, in softmax
            size[dim] = int(index.max()) + 1
        out = torch.zeros(size, dtype=src.dtype, device=src.device)
        return out.scatter_add_(dim, index, src)
               ~~~~~~~~~~~~~~~~ <--- HERE
    else:
        return out.scatter_add_(dim, index, src)
RuntimeError: scatter_add_cuda_kernel does not have a deterministic implementation, but you set 'torch.use_deterministic_algorithms(True)'. You can turn off determinism just for this operation if that's acceptable for your application. You can also file an issue at https://github.com/pytorch/pytorch/issues to help us prioritize adding deterministic support for this operation.

I tried adding torch.use_deterministic_algorithms(True), but it does not work. The same code works fine on CPU. How can I avoid this error?
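
The error message suggests turning determinism off just for the failing operation. Below is a minimal sketch of that idea, not a fix confirmed in this thread. The helper names `nondeterministic_allowed` and `scatter_sum` are hypothetical, and the sketch assumes PyTorch 1.11 or later, where `torch.is_deterministic_algorithms_warn_only_enabled()` is available. On CUDA, the result of the wrapped call may then vary slightly between runs.

```python
import contextlib

import torch


@contextlib.contextmanager
def nondeterministic_allowed():
    """Temporarily disable the global deterministic-algorithms check."""
    was_enabled = torch.are_deterministic_algorithms_enabled()
    warn_only = torch.is_deterministic_algorithms_warn_only_enabled()
    torch.use_deterministic_algorithms(False)
    try:
        yield
    finally:
        # Restore whatever setting was active before entering the block.
        torch.use_deterministic_algorithms(was_enabled, warn_only=warn_only)


def scatter_sum(src, index, dim_size):
    """Hypothetical 1-D scatter-add that opts out of the determinism check."""
    out = torch.zeros(dim_size, dtype=src.dtype, device=src.device)
    with nondeterministic_allowed():
        return out.scatter_add_(0, index, src)
```

The same context manager could wrap a whole forward pass, for example `with nondeterministic_allowed(): out = model(x, edge_index)`. Alternatively, `torch.use_deterministic_algorithms(True, warn_only=True)` (also PyTorch 1.11+) turns the error into a warning for operations that lack a deterministic implementation.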

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 24 (14 by maintainers)

Top GitHub Comments

1 reaction
cxw-droid commented, Jul 16, 2022

Hi @rusty1s, I am trying to make mutag_gin.py produce a deterministic result. Following the suggestions above, I changed the dataset to use SparseTensor, replaced edge_index with adj_t, and changed line 53 to x = global_max_pool(x, batch), but I still get random results. I set the seed as follows:

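import torch

seed = 2
torch.manual_seed(seed)  # seeds the random number generators on all devices
# np.random.seed(seed)
# random.seed(seed)
torch.backends.cudnn.benchmark = False  # disable cuDNN benchmarking of kernels
torch.backends.cudnn.deterministic = True  # restrict cuDNN to deterministic kernels
torch.use_deterministic_algorithms(True)  # raise an error on nondeterministic ops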

Any help would be appreciated. Thanks.
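
One way to narrow down where the randomness comes from is to run a single piece of the pipeline twice with the same seed and compare the outputs. The sketch below is only an illustration of that check; `seed_everything` and `is_reproducible` are hypothetical helper names, and NumPy seeding is left out to keep the example dependency-free.

```python
import random

import torch


def seed_everything(seed):
    """Seed Python's and PyTorch's random number generators."""
    random.seed(seed)
    torch.manual_seed(seed)  # covers CPU and CUDA generators


def is_reproducible(fn, seed=2):
    """Call fn twice with the same seed and report whether the outputs match exactly."""
    seed_everything(seed)
    first = fn()
    seed_everything(seed)
    second = fn()
    return torch.equal(first, second)
```

For instance, `is_reproducible(lambda: model(data.x, data.adj_t))` would test a single forward pass, assuming `model` and `data` are already defined and the model is in eval mode.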

1 reaction
andrei-rusu commented, Mar 5, 2022

Well, from my understanding GINEConv requires the edge_attr to have a dimensionality of in_channels. GATConv, on the other hand, requires edge_attr to have a dimensionality of heads * out_channels. I don’t see why one wouldn’t be able to create their own projections in order to ensure edge_attr has the correct size for each of these cases. I agree, however, that not enforcing edge_dim may be confusing to some (and the documentation is already explicit about this internal Linear layer), so I don’t really have a problem with it as long as add_self_loops works fine.

And yep, the final solution should be what you said in the second paragraph! However, as per your previous comment, a temporary workaround could be to just add the self loops as part of the Transform or directly to the original edge_index & edge_attr tensors before creating the SparseTensor. Thanks!
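
The temporary workaround described above, adding self loops directly to edge_index and edge_attr before the SparseTensor is built, could look roughly like this in plain PyTorch. This is a sketch under assumptions: the function name and the `fill_value` used for the new self-loop attributes are hypothetical, edge_attr is assumed to have shape [num_edges, edge_dim], and the graph is assumed to contain no self loops yet.

```python
import torch


def add_self_loops_with_attr(edge_index, edge_attr, num_nodes, fill_value=0.0):
    """Append one self loop per node, with constant attributes for the new edges."""
    loop_index = torch.arange(
        num_nodes, dtype=edge_index.dtype, device=edge_index.device
    )
    loop_index = loop_index.unsqueeze(0).repeat(2, 1)  # shape [2, num_nodes]
    # Self-loop attributes: fill_value is an assumption, not prescribed by the thread.
    loop_attr = edge_attr.new_full((num_nodes, edge_attr.size(1)), fill_value)
    new_index = torch.cat([edge_index, loop_index], dim=1)
    new_attr = torch.cat([edge_attr, loop_attr], dim=0)
    return new_index, new_attr
```

The returned tensors can then be used to construct the SparseTensor, with add_self_loops disabled in the layer so the loops are not added twice.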
