Detection of device synchronization
Device synchronization is a source of performance degradation, but currently even some CuPy functions implicitly synchronize the device, whether intentionally or not (#2797).
It would be nice to have a mechanism to detect device synchronization.
Proposal
Add a context manager within which any operation that synchronizes the device causes an error. The context manager sets a thread-local configuration variable.
    with cupyx.allow_synchronize(False):
        ...  # operations that are not supposed to synchronize
In low-level CuPy code that causes synchronization, such as ndarray.get, insert a _declare_synchronize() call as follows:
    cpdef _declare_synchronize():
        ...  # Raise an error if within allow_synchronize(False).

    cdef class ndarray:
        cpdef get(self, ...):
            _declare_synchronize()
            ...
Application in unit tests
Given the above implemented, we could apply it in testing.numpy_cupy_equal, for example. If the test code has expected device synchronization, the test implementation can enclose the code with cupyx.allow_synchronize(True) explicitly.
Notes
ndarray.get may copy asynchronously if certain conditions are met (when using a non-default stream and given an explicit out with pinned memory). It's difficult for the code to predict whether the call is actually synchronous or not. Maybe it's safer to separate the asynchronous behavior into another method like ndarray.get_async.
Issue Analytics
- State:
- Created 4 years ago
- Reactions:2
- Comments:7 (7 by maintainers)
Top GitHub Comments
I think this is great and could be pretty useful for improving the performance of user code.
One minor detail that just came to mind: instead of explicitly declaring every method that synchronizes, could we make the CUDA runtime call used to synchronize devices check the flag set by the context manager?
This will solve the second issue, but it will not detect calls that “could” synchronize but do not actually do so due to certain runtime conditions. (I don’t know if your proposal is to error on anything that may synchronize, which is also legitimate.)
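The trade-off in this comment can be illustrated with a small pure-Python sketch. Here the check lives in a single low-level function instead of in every method. `stream_synchronize` and `get` are hypothetical stand-ins, not CuPy or CUDA runtime functions, and the `asynchronous` parameter is an assumption made for the example.

```python
import contextlib
import threading

_state = threading.local()


@contextlib.contextmanager
def allow_synchronize(allow):
    # Minimal stand-in for the proposed context manager.
    previous = getattr(_state, 'allow', True)
    _state.allow = allow
    try:
        yield
    finally:
        _state.allow = previous


def stream_synchronize():
    """Stand-in for the one low-level call that actually synchronizes."""
    if not getattr(_state, 'allow', True):
        raise RuntimeError('device synchronization detected')


def get(data, asynchronous=False):
    """Stand-in for a copy that synchronizes on only one code path."""
    result = list(data)
    if not asynchronous:
        stream_synchronize()  # only this path is ever checked
    return result
```

Inside `allow_synchronize(False)`, `get(data, asynchronous=True)` passes silently even though the same method can synchronize under other conditions, which is the limitation the comment describes.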
Thank you for the measurement, but 6us looks quite expensive. In my environment it’s about 1us, but I’m still not confident that it’s cheap enough for an overhead of the lowest-level API call.
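For a rough feel of how such overhead could be measured, here is a sketch using the standard library's `timeit` on a pure-Python thread-local check. It does not reproduce the measurement discussed in the thread: the function names are hypothetical, and a Cython implementation would be expected to show different numbers.

```python
import threading
import timeit

_state = threading.local()


def declare_synchronize():
    # Stand-in for the flag check added to a low-level call.
    if not getattr(_state, 'allow', True):
        raise RuntimeError('device synchronization detected')


def per_call_overhead_us(number=100_000):
    """Return the average cost of one check, in microseconds."""
    total_seconds = timeit.timeit(declare_synchronize, number=number)
    return total_seconds / number * 1e6


if __name__ == '__main__':
    print(f'{per_call_overhead_us():.3f} us per call')
```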