Responses arrive in a different order than they were sent
Description
We are using a decoupled model with the gRPC streaming API, which means that clients do not expect a response for every request. On the server side we either mark the response as complete with TRITONBACKEND_ResponseFactorySendFlags(response_factory, TRITONSERVER_RESPONSE_COMPLETE_FINAL)
when we don't need to send any data back, or we use the TRITONBACKEND_ResponseSend() API function to send data like this:
TRITONBACKEND_Response* response;
// Create a response from the factory kept from the original request.
TRITONBACKEND_ResponseNewFromFactory(&response, response_factory);
// ...add output tensors to the response...
// Send the response and mark the request as complete.
TRITONBACKEND_ResponseSend(response, TRITONSERVER_RESPONSE_COMPLETE_FINAL, nullptr);
The TRITONBACKEND_ResponseSend() calls are made from processing threads, but according to the documentation this should be fine as long as we keep alive the response factory created from the original request.
Now the problem we see: sometimes the order in which the client receives the responses differs from the order of the TRITONBACKEND_ResponseSend() calls. To debug this we added a response counter to each response, so the responses were marked 1, 2, 3, 4, 5 and so on, and the same counter was printed in the server traces. However, the client received the responses in the wrong order, e.g. 1, 2, 3, 5, 4. For shorter message exchanges it was not so bad - for example only the last two responses were swapped - but longer exchanges were totally out of order and the client received a sequence like 1, 2, 3, 4, 21, 18, 10, 15, 8, 9...
We noticed this behavior under a higher load with more clients connected.
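Since each response already carries a sequence counter (as in the debugging setup above), one possible client-side workaround is to buffer out-of-order arrivals and deliver them strictly in counter order. A minimal self-contained sketch - the ResponseReorderBuffer class is hypothetical and not part of the Triton client API:

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Buffers responses that arrive out of order and hands them to the
// callback strictly in sequence-number order (1, 2, 3, ...).
class ResponseReorderBuffer {
 public:
  explicit ResponseReorderBuffer(
      std::function<void(const std::string&)> deliver)
      : deliver_(std::move(deliver)) {}

  // Called for each arriving response; 'seq' is the counter the server
  // embedded in the response payload.
  void OnResponse(uint64_t seq, std::string payload) {
    pending_.emplace(seq, std::move(payload));
    // Flush every response that is now contiguous with what was delivered.
    while (!pending_.empty() && pending_.begin()->first == next_) {
      deliver_(pending_.begin()->second);
      pending_.erase(pending_.begin());
      ++next_;
    }
  }

 private:
  std::function<void(const std::string&)> deliver_;
  std::map<uint64_t, std::string> pending_;  // seq -> payload, key-sorted
  uint64_t next_ = 1;  // next sequence number to deliver
};
```

With arrivals 1, 2, 3, 5, 4 the callback fires for 1, 2, 3, holds 5, then fires for 4 and 5 once the gap closes. This trades a little latency and memory for restored ordering, independent of what the server or gRPC stream guarantees.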
So this is perhaps rather a question of whether this is expected behavior or a bug.
Triton Information 22.02
Are you using the Triton container or did you build it yourself? Triton container
To Reproduce See the description.
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well). We are using a decoupled model.
Expected behavior We expect that, for the same client, the order of TRITONBACKEND_ResponseSend() calls corresponds to the order of responses received on the client - i.e. when we send responses 1, 2, 3, the client receives them in that order.
Issue Analytics
- State:
- Created a year ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
@JindrichD Can you elaborate on the client that you used? Are you using the stream_infer clients? I think they will preserve the order of the responses sent from the server to the client.

Yeah, this is what we are using - well, we are using the C++ clients, not the Python one, but they should be equivalent, shouldn't they? So we have the client tc::InferenceServerGrpcClient *client (created with the InferenceServerGrpcClient::Create() function), then we start a stream with client->StartStream(...) and we call client->AsyncStreamInfer(...) to perform the infer requests. For a long time we thought that the order of the responses was preserved, but it turned out that it is not.