How to all_gather Tensor if not the same length
❓ Questions/Help/Support
Hi @vfdev-5,
I am developing a distributed evaluation feature and have run into a problem: the `preds` and `labels` on different GPUs don't have the same length, so `ignite.idist.all_gather()` can't work. For example, GPU0 has 5 images to handle and GPU1 has 4, for a total of 9 images.
Could you please help with how to `idist.all_gather()` the values?
I don't want to pad the input data to make it evenly divisible, because that would make the metrics differ between single-GPU and multi-GPU runs.
Thanks in advance.
Issue Analytics
- State:
- Created 3 years ago
- Comments:12 (7 by maintainers)
Hi @vfdev-5,
Your example code looks good, and I have now developed an `evenly_divisible_all_gather()` in MONAI to handle this case:
Thanks.
I compared your program with mine and found the root cause. Sorry for my mistake; `all_gather` works for my string now. Thanks very much for your help and example program!
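One common way to gather values of unequal length without changing the metric is to pad only for the communication step and trim afterwards. The sketch below illustrates that idea in plain Python so it runs without GPUs. It is not the MONAI `evenly_divisible_all_gather()` implementation. The names `simulated_all_gather` and `gather_uneven` are hypothetical, lists stand in for tensors, and the simulated collective stands in for a real call such as `idist.all_gather()`, which I assume concatenates equal-sized tensors from every rank.

```python
def simulated_all_gather(per_rank_chunks):
    """Hypothetical stand-in for a collective all_gather.

    Takes one chunk per rank and returns their concatenation in rank order.
    Like the real collective, it refuses chunks of different lengths.
    """
    if len({len(chunk) for chunk in per_rank_chunks}) != 1:
        raise ValueError("all_gather requires the same length on every rank")
    return [item for chunk in per_rank_chunks for item in chunk]


def gather_uneven(per_rank_data, pad_value=0):
    """Gather variable-length data from all ranks without keeping the padding."""
    # Step 1: share each rank's true length (one value per rank, so sizes match).
    lengths = simulated_all_gather([[len(data)] for data in per_rank_data])
    max_len = max(lengths)

    # Step 2: pad every rank's data to the longest length, for transport only.
    padded = [
        list(data) + [pad_value] * (max_len - len(data))
        for data in per_rank_data
    ]

    # Step 3: the gather now succeeds because all chunks have max_len items.
    flat = simulated_all_gather(padded)

    # Step 4: cut each rank's slice back to its true length, dropping the padding.
    result = []
    for rank, length in enumerate(lengths):
        start = rank * max_len
        result.extend(flat[start:start + length])
    return result


# The case from the question: GPU0 holds 5 predictions, GPU1 holds 4.
preds_gpu0 = [0.1, 0.2, 0.3, 0.4, 0.5]
preds_gpu1 = [0.6, 0.7, 0.8, 0.9]
all_preds = gather_uneven([preds_gpu0, preds_gpu1])
print(len(all_preds))  # 9
```

Because the padding is removed before any metric sees the data, the gathered result holds exactly the 9 original values, so the metric should match a single-GPU run.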