[QUESTION] grpc_server.cpp ModelInferHandler::Process the same Request instance for different correlation_id
Hi!
In file grpc_server.cpp, in method ModelInferHandler::Process, there is code near `} else if (state->step_ == Steps::READ) {`.
I added debug output:

```cpp
TRITONSERVER_Error* err = nullptr;
const inference::ModelInferRequest& request = state->request_;
LOG_VERBOSE(1) << "Process READ for " << Name() << ", state_id "
               << state->unique_id_ << ", request_addr "
               << std::addressof(state->request_) << ", request " << request.id();
```
and I see strings like these in the output:
```
Dec 03 13:42:58 ub2004.private tritonserver[300604]: I1203 13:42:58.092498 300604 grpc_server.cc:3532] New request handler for ModelStreamInferHandler, state_id 3, request_addr 0x7fcd14001920
Dec 03 13:42:58 ub2004.private tritonserver[300604]: I1203 13:42:58.102018 300604 grpc_server.cc:3538] Process READ for ModelStreamInferHandler, state_id 3, request_addr 0x7fcd14001920, request 1_0_1_0
Dec 03 13:42:58 ub2004.private tritonserver[300604]: I1203 13:42:58.217927 300604 grpc_server.cc:3855] Process for ModelStreamInferHandler, rpc_ok=1, context 2, 3 step WRITEREADY is NOT complete. , request_addr 0x7fcd14001920, request 1_0_1_0
Dec 03 13:42:58 ub2004.private tritonserver[300604]: I1203 13:42:58.217935 300604 grpc_server.cc:3866] Process for ModelStreamInferHandler, rpc_ok=1, context 2, 3 step WRITEREADY set state to outgoing write. , request_addr 0x7fcd14001920, request 1_0_1_0
Dec 03 13:42:58 ub2004.private tritonserver[300604]: I1203 13:42:58.596181 300604 grpc_server.cc:3538] Process READ for ModelStreamInferHandler, state_id 385, request_addr 0x7fcd14001920, request 172_1671_0_0
```
Explanation:
- `1_0_1_0` means corr_id=1, request_sequential_id (autoincrement)=0, START=1, END=0
- `172_1671_0_0` means corr_id=172, request_sequential_id (autoincrement)=1671, START=0, END=0
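The composite request id format described above can be decoded with a simple split on `_`. The helper below (`SplitRequestId` is a hypothetical name, not part of Triton) is a minimal sketch assuming the `corrid_seqid_START_END` layout shown in the logs:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a composite request id such as "172_1671_0_0"
// into its fields (corr_id, request_sequential_id, START flag, END flag).
std::vector<std::string> SplitRequestId(const std::string& id)
{
  std::vector<std::string> fields;
  std::stringstream ss(id);
  std::string field;
  // '_' is the field separator used in the logged request ids.
  while (std::getline(ss, field, '_')) {
    fields.push_back(field);
  }
  return fields;
}
```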
`request_addr` is `std::addressof(state->request_)`.

So how is it possible that one and the same `state->request_` is used for different utterances in one batch?
Issue Analytics
- Created 2 years ago
- Comments: 19 (8 by maintainers)
Top GitHub Comments
New issue (BUG) opened: https://github.com/triton-inference-server/server/issues/3694

This is part of an optimization coming from protobuf message memory reuse. The `request_` is part of `InferHandlerState`, and the state objects are kept in a bucket. See here for the logic of how, during StateNew, previously created state objects are reused instead of creating new ones.
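The bucket-reuse pattern explains why two different logical requests can show the same `request_addr`: a released state object goes back into the bucket, and the next allocation pops it out, so the new request's protobuf message lives at the same address as the old one. The sketch below is a simplified illustration of that pattern under assumed names (`State`, `StateBucket`), not Triton's actual code:

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-in for InferHandlerState; request_id stands in for the
// protobuf request_ member that gets overwritten on reuse.
struct State {
  std::string request_id;
};

class StateBucket {
 public:
  // Pop a previously released state from the bucket if one exists,
  // otherwise allocate a fresh one. Either way the request field is
  // overwritten for the new logical request.
  State* StateNew(const std::string& request_id)
  {
    State* state;
    if (!bucket_.empty()) {
      state = bucket_.back().release();  // reuse old allocation
      bucket_.pop_back();
    } else {
      state = new State();
    }
    state->request_id = request_id;
    return state;
  }

  // Return a finished state to the bucket for later reuse.
  void StateRelease(State* state) { bucket_.emplace_back(state); }

 private:
  std::vector<std::unique_ptr<State>> bucket_;
};
```

With this pattern, releasing a state and then creating a "new" one yields the same pointer, which is exactly the behavior seen in the debug output above.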