Seeking suggestions for embedding into ddp
Hi, PyTorch 1.8 has a new hook, torch.nn.parallel.DistributedDataParallel.register_comm_hook(). Any advice on how to integrate GRACE into DDP using the dist examples?
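For context, a DDP communication hook receives a gradient bucket and must return a torch.futures.Future wrapping the reduced tensor. Below is a minimal sketch against a recent PyTorch version, assuming a hypothetical `state.compressor` with `compress`/`decompress` methods; this is not the actual GRACE interface, just one way a compressor could be plugged in via register_comm_hook.

```python
import torch
import torch.distributed as dist


def compression_comm_hook(state, bucket):
    """Minimal sketch of a DDP communication hook.

    ``state`` is assumed to carry a GRACE-like compressor exposing
    hypothetical ``compress``/``decompress`` methods; the real GRACE
    interface may differ.
    """
    tensor = bucket.buffer()          # flattened gradients of this bucket
    world_size = dist.get_world_size()

    # Hypothetical compression/decompression round-trip (e.g. top-k
    # sparsification). A real integration would communicate the
    # compressed representation instead of the dense tensor.
    compressed, ctx = state.compressor.compress(tensor)
    dense = state.compressor.decompress(compressed, ctx)

    # Plain asynchronous allreduce of the dense tensor, averaged afterwards.
    fut = dist.all_reduce(dense, op=dist.ReduceOp.SUM, async_op=True).get_future()

    def average(fut):
        return fut.value()[0] / world_size

    return fut.then(average)


# Registration, assuming ``ddp_model`` is a DistributedDataParallel instance:
# ddp_model.register_comm_hook(state=some_state, hook=compression_comm_hook)
```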
Issue Analytics
- Created 2 years ago
- Comments: 7 (4 by maintainers)
Top GitHub Comments
dist.all_reduce simply sums up dense inputs across the nodes; it does not support the value-index pair sparse tensor format. To perform an allreduce for sparse tensors, you need to use allgather to collect all the sparse tensors, cast them into dense format locally, and then sum them up. In GRACE, we have developed an allgather for tensors with different lengths, please check here.
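To illustrate the idea of gathering value-index sparse tensors of different lengths, here is a rough sketch using torch.distributed. It is not the GRACE implementation; it just shows one common approach of padding to the maximum length, running a regular all_gather, and trimming locally.

```python
import torch
import torch.distributed as dist


def allgather_varlen(tensor):
    """Gather 1-D tensors of different lengths from all ranks (sketch only)."""
    world_size = dist.get_world_size()

    # 1. Exchange the lengths so every rank knows how much to trim.
    local_len = torch.tensor([tensor.numel()], device=tensor.device)
    lens = [torch.zeros_like(local_len) for _ in range(world_size)]
    dist.all_gather(lens, local_len)
    lens = [int(l.item()) for l in lens]
    max_len = max(lens)

    # 2. Pad to a common length and gather.
    padded = torch.zeros(max_len, dtype=tensor.dtype, device=tensor.device)
    padded[: tensor.numel()] = tensor
    gathered = [torch.zeros_like(padded) for _ in range(world_size)]
    dist.all_gather(gathered, padded)

    # 3. Trim each gathered tensor back to its original length.
    return [g[:n] for g, n in zip(gathered, lens)]
```

With the gathered (values, indices) pairs from every rank, each worker can densify locally and sum them to complete the sparse allreduce.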
Thank you very much. Your implementation is really good. I do have the following two suggestions for your optimization: in DgcCompressor.decompress() there is a for loop to cast the sparse tensor into dense format, which can be very expensive for large gradients. You may want to use the scatter_add API to gain some speed.
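As a rough illustration of the scatter_add suggestion (the names `values`, `indices`, and `numel` below are placeholders, not the actual DgcCompressor internals), the loop-based densification can be replaced by a single vectorized call:

```python
import torch


# Loop-based densification (what a Python for-loop over the sparse
# entries amounts to) -- slow for large gradients:
def densify_loop(values, indices, numel):
    dense = torch.zeros(numel, dtype=values.dtype, device=values.device)
    for v, i in zip(values, indices):
        dense[i] += v
    return dense


# Vectorized equivalent using scatter_add_, which also accumulates
# duplicate indices correctly:
def densify_scatter(values, indices, numel):
    dense = torch.zeros(numel, dtype=values.dtype, device=values.device)
    dense.scatter_add_(0, indices.long(), values)
    return dense
```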
Regarding the poor performance: are you comparing with GRACE DGC or with the no-compression baseline? Also please note that gradient compression is not always beneficial, depending on the model architecture, network conditions, and number of nodes. Could you please specify your testing environment?