What should I do when backend_input_collector returns need_cuda_input_sync as true?
See original GitHub issue
I am developing a custom backend in C++. After calling the backend utility function backend_input_collector.Finalize(), the returned variable need_cuda_input_sync was true. The official custom backend examples are all written for backends that run on CPU, as you can see in the demo code:
This custom backend runs after a GPU model in an outer ensemble model, so it needs to fetch its input buffers from the GPU. What should I do when need_cuda_input_sync comes back as true?
More to add: if I ignore the need_cuda_input_sync flag and continue with the logic, the server crashes with a core dump when it is called concurrently, as shown below:
Issue Analytics
- State:
- Created a year ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Hi @frankxyy,
As the comment points out, you'll need to synchronize the CUDA stream/event if collector.Finalize() returns true. Taking a look at some other backends that do this may be useful, such as ONNXRuntime: https://github.com/triton-inference-server/onnxruntime_backend/blob/main/src/onnxruntime.cc#L2034
CC @tanmayv25
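For anyone landing here later, the control flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Triton or the ONNXRuntime backend: FakeStream, FakeCollector, SyncStream and GatherInputs are hypothetical stand-ins so the snippet compiles without Triton or CUDA. In a real backend the collector would be the BackendInputCollector, and the sync step would typically be a cudaStreamSynchronize() call on the stream that was given to the collector.

```cpp
#include <cassert>

// Hypothetical stand-in for a CUDA stream; not a Triton or CUDA type.
struct FakeStream {
  bool copy_pending = false;  // true while an async copy is still in flight
  int sync_calls = 0;         // counts how often the stream was synchronized
};

// Hypothetical stand-in for synchronizing the stream
// (cudaStreamSynchronize() in a real GPU build).
void SyncStream(FakeStream& stream) {
  stream.copy_pending = false;
  ++stream.sync_calls;
}

// Hypothetical stand-in for the input collector. Finalize() returns true
// when asynchronous copies were issued and the caller must synchronize.
struct FakeCollector {
  FakeStream& stream;
  bool used_async_copy;

  bool Finalize() {
    if (used_async_copy) {
      stream.copy_pending = true;
    }
    return used_async_copy;
  }
};

// Gathers inputs and only returns once the buffers are safe to read.
// Returns true if a synchronization was needed.
bool GatherInputs(FakeCollector& collector, FakeStream& stream) {
  const bool need_cuda_input_sync = collector.Finalize();
  if (need_cuda_input_sync) {
    // Do not skip this: reading the buffers before the copies finish is
    // the kind of race that can show up as crashes under concurrent load.
    SyncStream(stream);
  }
  assert(!stream.copy_pending);  // buffers must be complete before use
  return need_cuda_input_sync;
}
```

The point of the sketch is the ordering: check the flag right after Finalize(), synchronize if it is set, and only then touch the collected input buffers.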
Closing as this seems resolved, please open a new issue for any new questions.