Possibility to recreate a connection to the Triton server
Description
Hi, this probably does not fall into the bug category and I was not sure whether it would rather fit a feature request; anyway, here it is. We are using the gRPC streaming client tc::InferenceServerGrpcClient (created with the InferenceServerGrpcClient::Create() function), the stream is started with StartStream(...), and we call AsyncStreamInfer(...) to issue the inference request. When the client receives the results, the pointer is freed and everything repeats.
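For context, a minimal sketch of the per-request pattern described above, assuming the C++ client header grpc_client.h, a server at localhost:8001, and a model named "my_model" with a single FP32 input "INPUT0"; these names, shapes, and data are placeholders rather than details from the original report, and error checking is omitted.

```cpp
#include <memory>
#include <vector>

#include "grpc_client.h"

namespace tc = triton::client;

// One request with a freshly created client, roughly as described above.
void infer_once(const std::vector<float>& data)
{
  // Create a new streaming client for this request.
  std::unique_ptr<tc::InferenceServerGrpcClient> client;
  tc::InferenceServerGrpcClient::Create(&client, "localhost:8001");

  // Start the bidirectional stream; the callback receives each result.
  client->StartStream([](tc::InferResult* result) {
    // Consume the result, then free it; afterwards the whole cycle repeats.
    delete result;
  });

  // Describe the input tensor and attach the request data.
  tc::InferInput* input;
  tc::InferInput::Create(
      &input, "INPUT0", {1, static_cast<int64_t>(data.size())}, "FP32");
  input->AppendRaw(
      reinterpret_cast<const uint8_t*>(data.data()),
      data.size() * sizeof(float));

  // Issue the request on the stream.
  tc::InferOptions options("my_model");
  client->AsyncStreamInfer(options, {input});

  // In the real application the client is kept alive until the callback has
  // fired and only then deleted; the report is that even then the underlying
  // TCP connection is not closed.
}
```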
I thought that when the client is deleted all of its network connections are shut down as well; however, that is not the case. I noticed an increasing number of TCP connections going to the Triton server. I don't fully understand when they are created, since it is not after each call to the Create() function, and after some time I saw "too many open files" errors on the server. I was able to partially work around this by reusing the stream, but that is not optimal because I needed to increase the stream timeout, so this rather looks like a bug to me.

However, we would like to control when new TCP connections are created anyway: our clients sit behind a load balancer and we would like to open new connections when servers are about to be scaled down.

So my question is whether it is possible to somehow control when new TCP connections are created. I did not find anything for this in the client API, so maybe it can be done with some gRPC environment variables?
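For illustration only, a rough sketch of the stream-reuse workaround mentioned above: one client and one stream stay alive and all requests go through them, with a larger stream timeout. The timeout value, model name, and input details are placeholders rather than values from the report, and error checking is omitted.

```cpp
#include <memory>
#include <vector>

#include "grpc_client.h"

namespace tc = triton::client;

int main()
{
  // One long-lived client and stream instead of one per request.
  std::unique_ptr<tc::InferenceServerGrpcClient> client;
  tc::InferenceServerGrpcClient::Create(&client, "localhost:8001");

  // stream_timeout is in microseconds; the workaround required raising it so
  // the idle stream is not torn down between requests (placeholder value).
  client->StartStream(
      [](tc::InferResult* result) { delete result; },
      /*enable_stats=*/true,
      /*stream_timeout=*/60u * 1000u * 1000u);

  // Placeholder input tensor, reused for every request in this sketch.
  std::vector<float> data{0.f, 1.f, 2.f, 3.f};
  tc::InferInput* input;
  tc::InferInput::Create(&input, "INPUT0", {1, 4}, "FP32");
  input->AppendRaw(
      reinterpret_cast<const uint8_t*>(data.data()),
      data.size() * sizeof(float));

  tc::InferOptions options("my_model");
  for (int i = 0; i < 100; ++i) {
    // Every request goes over the same stream and the same TCP connection.
    client->AsyncStreamInfer(options, {input});
  }
  return 0;
}
```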
Triton Information: r22.02
Are you using the Triton container or did you build it yourself? Triton container.
To Reproduce: see above.
Expected behavior: the client user should be able to control the creation of TCP connections.
Top GitHub Comments
@JindrichD the best practices for gRPC in high-load applications listed here include reusing TCP connections. I might be missing something in my understanding, but some questions:

Why do you need to create a new connection every time? The client actually defaults to reusing connections, since closing old connections takes more time.

a. If you want a new connection every time you create a client, you can set force_new_connection to true and the new connection will replace the old one. However, if you are reusing connections, you shouldn't be getting the "too many open files" message…

b. Are you saying you are creating a new URL every time you swap out clients? If you do, then it makes sense that there is a "too many open files" message. However, if you are only doing load balancing, do the connections to the downscaled server need new URLs?

c. Along the same line of thought, you said you managed to fix your problem by reusing connections. Why is this a sub-optimal solution?

To your point on the shared std::map that manages all the connections: we can create a flag to close the connection when each client is destructed… this is a feature the gRPC client does not currently have.

cc: @tanmayv25 in case I'm missing something from the gRPC streaming functionality.
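As an aside (not part of the original thread), a minimal sketch of the behavior discussed in this comment, assuming the C++ client and a server at localhost:8001; the comments restate the explanation above rather than documented guarantees.

```cpp
#include <memory>

#include "grpc_client.h"

namespace tc = triton::client;

int main()
{
  // Two clients created with the same URL. Per the explanation above, the
  // client library keeps the connections in a shared map, so both clients
  // reuse the same underlying TCP connection instead of opening a second one.
  std::unique_ptr<tc::InferenceServerGrpcClient> a;
  std::unique_ptr<tc::InferenceServerGrpcClient> b;
  tc::InferenceServerGrpcClient::Create(&a, "localhost:8001");
  tc::InferenceServerGrpcClient::Create(&b, "localhost:8001");

  // Destroying the clients does not tear the cached connection down; the
  // proposal above is to add a flag (along the lines of force_new_connection
  // or a close-on-destruct option) to change that.
  a.reset();
  b.reset();
  return 0;
}
```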
Closing issue due to lack of activity. If you need further support, please let us know and we can reopen the issue.