
[question] Adding timeout options for both client and server with a custom backend and stateful batching

See original GitHub issue

I’m interested in adding robust timeout behaviour for a deployment where we use a custom backend for Triton that uses a stateful batcher. I’m using the stateful backend as a reference.

  • When timing out on the client side, I can see we use the stream_timeout option (as is done here) but it’s not clear how this interacts with the server. If I terminate the stream via the timeout on the client side, does that cause any code to run on the server side?
  • In the stateful batcher example, it looks like there’s a separate timer for each stream that evicts the stream from the server when it times out. As before, does this interact with the client in some way? Can we send an error message to the client when this happens?

In general any advice on managing clients and their states in a robust way would be appreciated, we’d really like to avoid any instance where a client is taking up a slot on the server but has already timed out (or the client is waiting for a response from the server that has already timed out its slot).
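To make the failure modes concrete, here is a runnable toy model of the two timers involved. All names here (`SequenceSlot`, `client_wait`) are illustrative, not Triton APIs; the point is the design rule that the client-side timeout should be no longer than the server’s idle-eviction timeout, so a client never keeps waiting on a slot the server has already reclaimed.

```python
import threading
import time

class SequenceSlot:
    """Toy model of a server-side sequence slot with an idle-eviction
    timer, loosely mimicking max_sequence_idle_microseconds.
    Illustrative only, not a Triton API."""
    def __init__(self, idle_timeout_s):
        self.evicted = threading.Event()
        self._timer = threading.Timer(idle_timeout_s, self.evicted.set)
        self._timer.start()

    def touch(self, idle_timeout_s):
        # Each request on the sequence resets the idle timer.
        self._timer.cancel()
        self._timer = threading.Timer(idle_timeout_s, self.evicted.set)
        self._timer.start()

    def close(self):
        self._timer.cancel()

def client_wait(result_ready, client_timeout_s):
    """Client-side wait with its own timeout (analogous in spirit to
    the gRPC client's stream_timeout)."""
    return result_ready.wait(timeout=client_timeout_s)

# Keep the client timeout shorter than the server idle timeout so the
# client always gives up before the server evicts the slot.
slot = SequenceSlot(idle_timeout_s=0.5)
result_ready = threading.Event()   # never set: simulates a stalled response
got_result = client_wait(result_ready, client_timeout_s=0.2)
print("client timed out:", not got_result)
time.sleep(0.6)
print("slot evicted after idle timeout:", slot.evicted.is_set())
slot.close()
```

With the timeouts the other way around (server idle timeout shorter than the client’s), you get exactly the scenario described above: the server reclaims the slot while the client is still blocked waiting for a response.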

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 1
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
Tabrizian commented, Jul 14, 2022

When timing out on the client side, I can see we use the stream_timeout option (as is done here) but it’s not clear how this interacts with the server. If I terminate the stream via the timeout on the client side, does that cause any code to run on the server side?

I think it will terminate the gRPC stream and close the connection on the server side. (CC @tanmayv25).
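Whether any backend code runs when the gRPC stream is torn down depends on Triton itself, but the general pattern (client-initiated cancellation triggering server-side cleanup) can be sketched with a stdlib `asyncio` analogy. Everything here is a hypothetical stand-in, not Triton or gRPC API:

```python
import asyncio

async def server_handler(cleanup_ran):
    # Stands in for a server-side stream handler. When the client tears
    # down the stream, the handler task is cancelled and gets a chance
    # to release resources (e.g. a sequence slot) before exiting.
    try:
        await asyncio.sleep(10)  # pretend to wait for more requests
    except asyncio.CancelledError:
        cleanup_ran.append(True)  # release the sequence slot here
        raise

async def main():
    cleanup_ran = []
    handler = asyncio.create_task(server_handler(cleanup_ran))
    await asyncio.sleep(0.1)
    # The client-side timeout fires: terminate the "stream".
    handler.cancel()
    try:
        await handler
    except asyncio.CancelledError:
        pass
    return cleanup_ran

cleanup_ran = asyncio.run(main())
print("server cleanup ran after client cancel:", bool(cleanup_ran))
```

The takeaway is that cancellation is a signal the server side can react to, but only if the handler is written to observe it; it is worth verifying what Triton’s gRPC frontend actually does with the sequence slot in this case rather than assuming.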

In the stateful batcher example, it looks like there’s a separate timer for each stream that evicts the stream from the server when it times out. Similar to before, does this interact with the client in some way? Can we send an error message to the client when this happens.

I think this is mainly for deleting the storage associated with the correlation IDs stored in the backend once max_sequence_idle_microseconds has elapsed; I don’t think it interacts with the client.

You might also be interested in the implicit state management API for the backends: https://github.com/triton-inference-server/core/blob/main/include/triton/core/tritonbackend.h#L689-L758

Currently, only the TensorRT and ONNX backends implement this API, but you can incorporate it into your own custom backends too. With implicit state management, the state tensors are handled internally by Triton core and you don’t need to store them in your backend.
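The real implicit-state API is the C interface in tritonbackend.h linked above; as a rough mental model only, here is a toy Python sketch of the division of labour it enables. `ToyCore` and `running_sum_backend` are made-up names: the “core” owns per-correlation-ID state and hands it to a stateless backend function on each request, then stores the updated state.

```python
class ToyCore:
    """Toy stand-in for Triton core under implicit state management:
    it keeps per-sequence state between requests so the backend does
    not have to. Illustrative only, not the tritonbackend.h API."""
    def __init__(self):
        self._states = {}  # correlation_id -> state

    def infer(self, correlation_id, request, backend_fn):
        prev_state = self._states.get(correlation_id, 0)
        output, new_state = backend_fn(request, prev_state)
        self._states[correlation_id] = new_state  # core stores the state
        return output

    def evict(self, correlation_id):
        # Called when the sequence idle timeout elapses.
        self._states.pop(correlation_id, None)

def running_sum_backend(request, state):
    # A stateless backend: it computes on (input, previous state) and
    # returns (output, new state) without storing anything itself.
    new_state = state + request
    return new_state, new_state

core = ToyCore()
for x in (1, 2, 3):
    out = core.infer(correlation_id=42, request=x,
                     backend_fn=running_sum_backend)
print(out)  # running sum of 1, 2, 3 -> 6
core.evict(42)
```

The practical benefit is exactly what the comment above describes: state lifetime (including eviction on idle timeout) becomes the core’s problem, not the backend’s.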

0 reactions
rakib-hasan commented, Jul 19, 2022

Thanks, Iman. What you said about the stateful backend is correct. The internal timer is there only to clean up the states for timed-out sequences.
