Why does `Copy` compute gradients in reversed order?
Nice work! I am confused about why `Copy` computes its gradients in reversed order.
https://github.com/kakaobrain/torchgpipe/blob/fca5d65fb68edddf1b056443d014c3aa7f416431/torchgpipe/copy.py#L59-L71
If I were writing this part, I would implement it directly, without reversing the order, like this:
# Inside Copy.backward: 'prev_stream' and 'next_stream' come from ctx and
# 'grad_output' from autograd; the helpers are from torchgpipe.stream.
from typing import List
from torch import Tensor
from torchgpipe.stream import current_stream, get_device, record_stream, use_stream

grad_input: List[Tensor] = []
input_stream = current_stream(get_device(prev_stream))

with use_stream(prev_stream), use_stream(next_stream):
    for x in grad_output:
        y = x.to(get_device(prev_stream))
        grad_input.append(y)

        # 'next_stream' is not where 'x' has been allocated.
        record_stream(x, next_stream)
        # 'y' has been allocated on 'prev_stream'.
        # It might be used on the current stream captured as 'input_stream'.
        record_stream(y, input_stream)
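
For comparison, the linked lines iterate the gradients in reverse and then restore the original order with a deque. Roughly paraphrased from the link above (not quoted verbatim):

from collections import deque
from typing import Deque

grad_input: Deque[Tensor] = deque(maxlen=len(grad_output))
input_stream = current_stream(get_device(prev_stream))

with use_stream(prev_stream), use_stream(next_stream):
    for x in reversed(grad_output):
        y = x.to(get_device(prev_stream))
        # appendleft keeps the returned gradients in their original order
        # even though the copy kernels are issued in reverse.
        grad_input.appendleft(y)
        # ... plus the same record_stream bookkeeping as in the snippet above.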
Is there something I am not taking into account? Could you explain it? Thank you very much.
Issue Analytics
- State: Closed
- Created 3 years ago
- Comments: 6 (3 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Sorry for the confusion.
I usually inspect a CUDA timeline to improve computational performance, so the readability of the CUDA timeline is important for my use case. When every kernel is ordered consistently, I can easily assume that the forward and backward timelines are symmetric with each other. There is no other big advantage to consistent ordering, but there is also no reason to order the kernels inconsistently.
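
To make the ordering concrete, here is a small standalone sketch (plain Python, no CUDA; the function names and the 'copy(...)' strings are made up for illustration). Issuing the backward copies in reverse mirrors the forward timeline, while a deque with appendleft still hands the gradients back in their original order:

from collections import deque
from typing import Deque, List

def forward_copies(tensors: List[str]) -> List[str]:
    # Forward pass: copy kernels are issued in order a, b, c.
    out = []
    for t in tensors:
        out.append('copy(%s)' % t)
    return out

def backward_copies(grads: List[str]) -> List[str]:
    # Backward pass: kernels are issued in reverse (c, b, a) so the timeline
    # mirrors the forward pass, but appendleft restores the original order
    # for the values returned to autograd.
    out: Deque[str] = deque(maxlen=len(grads))
    for g in reversed(grads):
        out.appendleft('copy(%s)' % g)
    return list(out)

print(forward_copies(['a', 'b', 'c']))   # ['copy(a)', 'copy(b)', 'copy(c)']
print(backward_copies(['a', 'b', 'c']))  # ['copy(a)', 'copy(b)', 'copy(c)']

Only the order in which the copies are issued differs; the returned sequence is identical either way.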
@sublee Thanks a lot. I will close the issue.