FeedRange - Scaling out support
With the new Change Feed pull model approaching and the use of FeedRange, one pending item is support for scaling out running iterators.
Currently, the flow is:
- User gets FeedRanges
- User starts 1 iterator per FeedRange on each machine/instance
- Some time later, some of the machines are running hot on CPU, possibly because the ranges they handle are hotter than those on other machines (or because those ranges have split)
How do we allow users to scale out if they want to?
Open to discussion
One idea would be to have a method on the FeedRange itself, possibly List<FeedRange> FeedRange.Scale(int? ranges), that attempts to split the range into the requested number of ranges. If the parameter is not passed, we can split either by physical partition affinity or by other limits.
Another idea would be to apply it to the Continuation tokens. Since this is a scenario where an iterator has already been running, we have a Continuation token. We could then take it as input and return new continuations after the split: List<string> FeedRange.Scale(string currentContinuationToken, int? expectedRanges).
Limits
Scale out operations can always face a limit. When EPK filtering is in place in the backend, a FeedRange cannot be split further than a single Partition Key value. Without EPK filtering in place, the limit is on the physical partition.
In any case, the API can always return a negative result, that is, “cannot scale out further” semantics.
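To make the splitting idea and its limit concrete, here is a minimal sketch. It is not the SDK's implementation: HashRange is a hypothetical stand-in for a FeedRange, modeled as a half-open interval over an integer hash space, and RangeScaler.Scale is an illustrative name. The "limit" is modeled as a range of width 1, which comes back unchanged.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a FeedRange: the half-open interval [Min, Max)
// over an integer hash space. The real SDK type does not expose this shape.
public readonly struct HashRange
{
    public HashRange(long min, long max)
    {
        if (max <= min) throw new ArgumentException("Max must be greater than Min.");
        Min = min;
        Max = max;
    }

    public long Min { get; }
    public long Max { get; }
    public long Width => Max - Min;
}

public static class RangeScaler
{
    // Splits 'range' into at most 'requested' contiguous sub-ranges.
    // When the range is already at the limit (Width == 1), the result contains
    // only the original range, which the caller can read as
    // "cannot scale out further".
    public static IReadOnlyList<HashRange> Scale(HashRange range, int requested)
    {
        if (requested < 1) throw new ArgumentOutOfRangeException(nameof(requested));

        // Never produce more pieces than there are distinct values in the range.
        long pieces = Math.Min(requested, range.Width);
        long baseSize = range.Width / pieces;
        long remainder = range.Width % pieces;

        var result = new List<HashRange>((int)pieces);
        long start = range.Min;
        for (long i = 0; i < pieces; i++)
        {
            // The first 'remainder' pieces take one extra value each.
            long size = baseSize + (i < remainder ? 1 : 0);
            result.Add(new HashRange(start, start + size));
            start += size;
        }
        return result;
    }
}
```

Under these assumptions, scaling [0, 10) into 3 yields [0, 4), [4, 7) and [7, 10), while scaling [5, 6) into 4 returns just [5, 6).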
Top GitHub Comments
@ealsur @j82w Here is the new ticket: #2759
I believe we talked about this and voted to not go for a Scale method.
Instead you can have:
The Try Pattern will naturally imply to the user that the operation may fail. You can add monadic overloads to return a reason for the failure:
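The snippets that accompanied this comment are not preserved here. Purely as an illustration of the Try pattern being described, and not the commenter's actual proposal or the SDK API, a sketch could look like the following. TryScale, ScaleFailure and the tuple-based range model are all hypothetical names and shapes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical failure reasons for a scale-out attempt.
public enum ScaleFailure
{
    None,
    InvalidRequest,
    AlreadyAtMinimum
}

public static class TryScaleSketch
{
    // Basic Try pattern: the bool return signals that the split may not happen.
    public static bool TryScale(
        long min, long max, int requested,
        out IReadOnlyList<(long Min, long Max)> ranges)
        => TryScale(min, max, requested, out ranges, out _);

    // Overload that also reports why the split failed.
    public static bool TryScale(
        long min, long max, int requested,
        out IReadOnlyList<(long Min, long Max)> ranges,
        out ScaleFailure reason)
    {
        ranges = Array.Empty<(long Min, long Max)>();

        if (requested < 2 || max <= min)
        {
            reason = ScaleFailure.InvalidRequest;
            return false;
        }

        long width = max - min;
        if (width < 2)
        {
            // A single value cannot be split further (the limit in this model).
            reason = ScaleFailure.AlreadyAtMinimum;
            return false;
        }

        long pieces = Math.Min(requested, width);
        var result = new List<(long Min, long Max)>((int)pieces);
        long start = min;
        for (long i = 0; i < pieces; i++)
        {
            // Spread the remainder over the first pieces.
            long size = width / pieces + (i < width % pieces ? 1 : 0);
            result.Add((start, start + size));
            start += size;
        }

        ranges = result;
        reason = ScaleFailure.None;
        return true;
    }
}
```

A caller would branch on the return value, for example `if (TryScaleSketch.TryScale(0, 10, 2, out var ranges)) { ... }`, and fall back to the overload with the reason only when it needs to log or react to the specific failure.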