RuntimeError: _cdist_backward requires X2 to be contiguous
See original GitHub issue
Hi, I am trying to train your AdderNet, but it returns a runtime error, which I suppose is caused by the .contiguous() call or some other uncommon operations used in your adder.py.
Could you help to solve this issue?
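For reference, here is a minimal sketch (not taken from adder.py) of what typically triggers this error and of a common workaround, assuming the traceback points at torch.cdist receiving a non-contiguous second argument:

```python
import torch

x = torch.randn(8, 16, requires_grad=True)
w = torch.randn(16, 4, requires_grad=True)

# w.t() is a non-contiguous view of w; on some PyTorch versions the backward
# pass of torch.cdist then fails with
# "RuntimeError: _cdist_backward requires X2 to be contiguous".
x2 = w.t().contiguous()          # making X2 contiguous is the usual workaround
dist = torch.cdist(x, x2, p=1)   # L1 (adder-style) pairwise distances, shape (8, 4)
dist.sum().backward()
```

If the non-contiguous tensor is produced inside adder.py (for example by a view or transpose), calling .contiguous() on it right before the cdist call is the usual fix.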
Issue Analytics
- State:
- Created 4 years ago
- Comments: 18
Top Results From Across the Web
Understanding cdist() function - PyTorch Forums
How can I eliminate GPU out-of-memory runtime error reported in new_dist()? ... RuntimeError: _cdist_backward requires X2 to be contiguous.
What does .contiguous() do in PyTorch? - Stack Overflow
The contiguous() function is usually required when we first transpose() a tensor and then ... RuntimeError Traceback (most recent call last) ...
RuntimeError: Function CdistBackward returned an invalid ...
Cause: computing the loss with torch.cdist().sum() is problematic; the correct norm computation is torch.dist or torch.norm(x - y).
https://fossies.org/linux/pytorch/test/test_torch....
assertRaisesRegex(RuntimeError, r'Expected a Storage of type'): error_storage ... length).to(device).contiguous() input_ = conv(input_).contiguous() input_ ...
Releases · mirrors / pytorch / pytorch - GitCode
If you require a contiguous output, you can pass the ... Fix torch.cdist backward CUDA error due to illegal gridDim setting (#51569); Prevent...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@ranery I saw there are two cdist() implementations online (code 1, code 2).

This is because, when calculating the gradient and error, the current PyTorch-based solution uses unfolding to collect and subtract the corresponding blocks of both the feature map and the weight filter, which leads to unnecessary memory consumption. I suppose the version used in their paper is the CUDA one, which convolves directly over the feature map, so its memory consumption is normal (similar to that of multiplication-based convolutional networks).
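For illustration only, the following is a rough sketch of the unfold-based approach described above; the function name adder_conv2d_unfold and the exact layout are hypothetical and do not come from the repository's adder.py. It uses the negative sum of absolute differences as the output, as adder-style layers typically do, and shows how broadcasting the unfolded feature-map blocks against the flattened filters materialises a large (N, C_out, C_in*K*K, L) intermediate, which is where the extra memory goes:

```python
import torch
import torch.nn.functional as F

def adder_conv2d_unfold(x, weight, stride=1, padding=0):
    # x: (N, C_in, H, W), weight: (C_out, C_in, K, K)
    n, c_in, h, w = x.shape
    c_out, _, k, _ = weight.shape
    # Unfold input into sliding blocks: (N, C_in*K*K, L), L = number of positions
    cols = F.unfold(x, kernel_size=k, stride=stride, padding=padding)
    w_flat = weight.view(c_out, -1)                    # (C_out, C_in*K*K)
    # Broadcast subtraction creates a (N, C_out, C_in*K*K, L) intermediate tensor,
    # which is the main source of the extra memory consumption.
    diff = cols.unsqueeze(1) - w_flat.unsqueeze(0).unsqueeze(-1)
    out = -diff.abs().sum(dim=2)                       # negative L1 distance, (N, C_out, L)
    h_out = (h + 2 * padding - k) // stride + 1
    w_out = (w + 2 * padding - k) // stride + 1
    return out.view(n, c_out, h_out, w_out)

x = torch.randn(2, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
y = adder_conv2d_unfold(x, w, padding=1)               # (2, 4, 8, 8)
```

A fused CUDA kernel, as described above, can compute the same output while only streaming over the feature map, so it avoids materialising that intermediate tensor.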