Correlation always zero if multiple GPUs

The following code snippet reproduces the bug:

import torch
from spatial_correlation_sampler import spatial_correlation_sample


def run_spatial_corr(rank):
    corr = spatial_correlation_sample(torch.ones(1, 512, 12, 27).to(f"cuda:{rank}"),
                                      torch.ones(1, 512, 12, 27).to(f"cuda:{rank}")).mean()
    print(corr)

run_spatial_corr(0)
run_spatial_corr(1)

The expected output is:

tensor(512., device='cuda:0')
tensor(512., device='cuda:1')

However, it returns:

tensor(512., device='cuda:0')
tensor(0., device='cuda:1')

The output is as expected if the device ordinals are the same, or if everything is executed on the CPU. I ran the code with Python 3.7 and PyTorch 1.2.
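
One plausible explanation, consistent with the observation above, is that the extension launches its CUDA kernel on the currently selected device (cuda:0 by default) rather than on the device of its inputs. Under that assumption, a minimal caller-side sketch of a workaround is to select the matching device for the duration of the call; this only tests the hypothesis and is not a fix inside the extension:

import torch
from spatial_correlation_sampler import spatial_correlation_sample


def run_spatial_corr_on_current_device(rank):
    # Make cuda:<rank> the current device before calling into the extension,
    # so a kernel launched on the "current" device targets the GPU that
    # actually holds the input tensors.
    with torch.cuda.device(rank):
        corr = spatial_correlation_sample(torch.ones(1, 512, 12, 27).to(f"cuda:{rank}"),
                                          torch.ones(1, 512, 12, 27).to(f"cuda:{rank}")).mean()
    print(corr)


run_spatial_corr_on_current_device(0)
run_spatial_corr_on_current_device(1)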

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 15 (14 by maintainers)

Top GitHub Comments

1 reaction
ClementPinard commented, Dec 8, 2020

There is indeed something fishy here. We should clarify this on the pytorch repo. Maybe we’ll see what they have to say regarding issues with tutorials such as the one you linked or https://github.com/pytorch/tutorials/issues/1196.

0 reactions
InnovArul commented, Dec 8, 2020

I have a fundamental doubt.

Should the custom kernel creators take care of setting the Guard (in that case, we can add it to the pytorch tutorial)? Or should pytorch itself take care of it internally in some way, or provide a user-friendly API to set the device?
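
For context, the "Guard" referred to here is presumably the device guard that a CUDA extension sets around its kernel launch so that the kernel runs on the device of its input tensors. Purely as an illustration of where such a guard would sit if it were done at the Python level (this is not the package's actual code), a library-side wrapper might look like:

import torch
from spatial_correlation_sampler import spatial_correlation_sample


def spatial_correlation_sample_guarded(input1, input2, **kwargs):
    # Library-side option: switch the current CUDA device to the inputs'
    # device for the duration of the call (assumes CUDA inputs), then let
    # the context manager restore the previous device.
    with torch.cuda.device(input1.device):
        return spatial_correlation_sample(input1, input2, **kwargs)

The alternative raised above is to leave the extension untouched and require callers to select the right device themselves, e.g. with torch.cuda.set_device or the torch.cuda.device context manager, as in the workaround sketch earlier in this thread.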

