Changing batch size between requests with shared memory fails
See original GitHub issue.
When running two clients with different batch sizes one after the other, I get server errors during inference (using shared memory) about the expected byte sizes.
Everything works as long as the batch size stays consistent between clients, or when the client is the very first one. But whenever an earlier client has already used a different batch size on the server, the current request fails with something similar to this:
Client error:
Server error message: expected buffer size to be 2945760 bytes but gets 5891520 bytes in output tensor
Server error:
[trtserver.cc:1212] Infer failed: expected buffer size to be 2945760 bytes but gets 5891520 bytes in output tensor
This happens when trying batch size 16 after another client successfully ran batch size 8, finished, and unregistered its shared memory. Registering the shared memory with the new size seems to work, but running inference fails. Note that the expected size in the error (2945760 bytes) is exactly half of the provided one (5891520 bytes), i.e. it matches the earlier batch-8 request rather than the current batch-16 one. All clients use the same model, of course.
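For reference, here is a minimal sketch of the client-side workflow described above, written against the current tritonclient Python API (the original report predates the Triton rename and would have used the older tensorrtserver client, but the register/infer/unregister flow is the same). Model and tensor names are hypothetical, and the per-sample element count is chosen only so that the byte sizes match the numbers in the error message (8 × 368220 = 2945760 bytes, 16 × 368220 = 5891520 bytes):

```python
# Hedged sketch, not the reporter's actual client code.
import numpy as np
import tritonclient.grpc as grpcclient
import tritonclient.utils.shared_memory as shm

BATCH = 16                 # the second client uses 16; the first client used 8
ELEMS_PER_SAMPLE = 92055   # hypothetical; 92055 floats = 368220 bytes per sample
byte_size = BATCH * ELEMS_PER_SAMPLE * np.dtype(np.float32).itemsize

client = grpcclient.InferenceServerClient("localhost:8001")

# Create a system shared-memory region sized for this client's batch size and
# register it with the server (the previous client's region of a different
# size was already unregistered).
out_handle = shm.create_shared_memory_region("output_shm", "/output_shm", byte_size)
client.register_system_shared_memory("output_shm", "/output_shm", byte_size)

# Input sent inline for brevity; the reported failure concerns the output tensor.
inp = np.zeros((BATCH, ELEMS_PER_SAMPLE), dtype=np.float32)
inputs = [grpcclient.InferInput("INPUT0", list(inp.shape), "FP32")]
inputs[0].set_data_from_numpy(inp)

# Ask the server to write the output into the registered region; this is the
# request that fails when a previous client used a different batch size.
outputs = [grpcclient.InferRequestedOutput("OUTPUT0")]
outputs[0].set_shared_memory("output_shm", byte_size)

results = client.infer(model_name="some_model", inputs=inputs, outputs=outputs)

# Clean up, as the original clients did between runs.
client.unregister_system_shared_memory("output_shm")
shm.destroy_shared_memory_region(out_handle)
```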
Top GitHub Comments
Fixed the GRPC failure when using different batch sizes in the aforementioned PR. Thank you for bringing it to our attention.
Thanks @philipp-schmidt. Just re-visited this. (Had earlier tested with HTTP and it worked fine.) You were right: there was an issue in the GRPC server. Fixed in grpc_server.cc.
Summary of change here.