RuntimeError: _cdist_backward requires X2 to be contiguous
See original GitHub issue
Hi, I am trying to train your AdderNet, but it returns a runtime error, which I suppose is caused by the .contiguous() call or some other uncommon operations used in your adder.py.
Could you help to solve this issue?
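For reference, here is a minimal sketch (not taken from adder.py) of what typically triggers this error and of a common workaround, assuming the traceback points at torch.cdist receiving a non-contiguous second argument:

```python
import torch

x = torch.randn(8, 16, requires_grad=True)
w = torch.randn(16, 4, requires_grad=True)

# w.t() is a non-contiguous view of w; on some PyTorch versions the backward
# pass of torch.cdist then fails with
# "RuntimeError: _cdist_backward requires X2 to be contiguous".
x2 = w.t().contiguous()          # making X2 contiguous is the usual workaround
dist = torch.cdist(x, x2, p=1)   # L1 (adder-style) pairwise distances, shape (8, 4)
dist.sum().backward()
```

If the non-contiguous tensor is produced inside adder.py (for example by a view or transpose), calling .contiguous() on it right before the cdist call is the usual fix.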
Issue Analytics
- State:
- Created 4 years ago
- Comments: 18
Top Results From Across the Web
Understanding cdist() function - PyTorch Forums
How can I eliminate GPU out-of-memory runtime error reported in new_dist()? ... RuntimeError: _cdist_backward requires X2 to be contiguous.
What does .contiguous() do in PyTorch? - Stack Overflow
The contiguous() function is usually required when we first transpose() a tensor and then ... RuntimeError Traceback (most recent call last) ...
RuntimeError: Function CdistBackward returned an invalid ...
Cause: computing the loss with torch.cdist().sum() is problematic; the correct norm computation is torch.dist or torch.norm(x - y).
https://fossies.org/linux/pytorch/test/test_torch....
assertRaisesRegex(RuntimeError, r'Expected a Storage of type'): error_storage ... length).to(device).contiguous() input_ = conv(input_).contiguous() input_ ...
Releases · mirrors / pytorch / pytorch - GitCode
If you require a contiguous output, you can pass the ... Fix torch.cdist backward CUDA error due to illegal gridDim setting (#51569); Prevent...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@ranery I saw there are two cdist() implementations online (code 1, code 2).

This is because, when calculating the gradient and error, the current PyTorch-based solution uses unfolding to collect and subtract the corresponding blocks of both the feature map and the weight filter, which leads to unnecessary memory consumption. I suppose the version used in their paper is the CUDA one, which convolves directly over the feature map, so its memory consumption is normal (similar to that of multiplication-based convolutional networks).
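For illustration only, the following is a rough sketch of the unfold-based approach described above; the function name adder_conv2d_unfold and the exact layout are hypothetical and do not come from the repository's adder.py. It uses the negative sum of absolute differences as the output, as adder-style layers typically do, and shows how broadcasting the unfolded feature-map blocks against the flattened filters materialises a large (N, C_out, C_in*K*K, L) intermediate, which is where the extra memory goes:

```python
import torch
import torch.nn.functional as F

def adder_conv2d_unfold(x, weight, stride=1, padding=0):
    # x: (N, C_in, H, W), weight: (C_out, C_in, K, K)
    n, c_in, h, w = x.shape
    c_out, _, k, _ = weight.shape
    # Unfold input into sliding blocks: (N, C_in*K*K, L), L = number of positions
    cols = F.unfold(x, kernel_size=k, stride=stride, padding=padding)
    w_flat = weight.view(c_out, -1)                    # (C_out, C_in*K*K)
    # Broadcast subtraction creates a (N, C_out, C_in*K*K, L) intermediate tensor,
    # which is the main source of the extra memory consumption.
    diff = cols.unsqueeze(1) - w_flat.unsqueeze(0).unsqueeze(-1)
    out = -diff.abs().sum(dim=2)                       # negative L1 distance, (N, C_out, L)
    h_out = (h + 2 * padding - k) // stride + 1
    w_out = (w + 2 * padding - k) // stride + 1
    return out.view(n, c_out, h_out, w_out)

x = torch.randn(2, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
y = adder_conv2d_unfold(x, w, padding=1)               # (2, 4, 8, 8)
```

A fused CUDA kernel, as described above, can compute the same output while only streaming over the feature map, so it avoids materialising that intermediate tensor.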