Stream semantics difference between CUDA/HIP
See original GitHub issue: https://github.com/cupy/cupy/pull/5091#issuecomment-821946143
I think this needs to be documented in CuPy, as it is almost impossible for users to know what's happening.
We may need to consider providing some remedies for this (e.g., on HIP, wait until hipStreamQuery
succeeds before destroying a stream).
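A rough sketch of what such a remedy could look like from the Python side: poll the stream until it reports that all work has finished, and only then drop the last reference to it. The helper name `wait_until_idle` and its parameters are hypothetical and not part of CuPy. It only assumes the object exposes a boolean `done` attribute, as `cupy.cuda.Stream` does (which, to my understanding, is backed by the runtime's stream query call).

```python
import time


def wait_until_idle(stream, poll_interval=0.001, timeout=None):
    """Poll ``stream.done`` until it is True.

    Hypothetical helper, not a CuPy API. ``stream`` is any object with a
    boolean ``done`` attribute (cupy.cuda.Stream has one).
    Returns True once the stream reports completion, or False if
    ``timeout`` seconds elapse first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not stream.done:
        if deadline is not None and time.monotonic() >= deadline:
            return False  # gave up; the stream may still have pending work
        time.sleep(poll_interval)
    return True
```

With CuPy this would be used as `wait_until_idle(stream)` right before `del stream`. Calling `stream.synchronize()` would be the simpler blocking alternative; whether either is sufficient on HIP is exactly the open question in this issue.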
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:23 (23 by maintainers)
Top GitHub Comments
I found that this might be the case. See https://github.com/ROCmSoftwarePlatform/rocBLAS/pull/1171
Paging @ekuznetsov139
Hmm… the original test https://github.com/cupy/cupy/pull/5091 still fails with ROCm 4.2.0. I also checked the code you showed at https://github.com/cupy/cupy/issues/5163#issuecomment-840086050 and it passed in the same environment. It seems I have something to investigate further…