PartGenerator streaming mode mixing parts content, or part content not properly delivered when downstream part prefetch is 0
Affects: 5.2.x, 5.3.x
When trying to use the `PartGenerator` in streaming mode (`streaming = true`), the documentation states that part content should be consumed in an ordered and serial fashion. We still end up with a `Flux<Part>`, but of course we cannot use the `flatMap` operator to trigger part content consumption, since it would break this requirement.
However, when using it, instead of raising an exception stating that parallel consumption of parts has been attempted, the generator first silently mixes up the parts' content (especially when very small parts are encountered, or when part consumption is slower than part reception/production), and an unexplicit exception is finally thrown when trying to consume the last or penultimate part, etc. (depending on how many parts have been silently skipped).
For example, if you have, say, a 20 KB part, then a 1 KB part, and finally a 250 KB part, you will probably end up with the content of the 3rd part being delivered as the 2nd part's content, without notice, and when trying to consume the 3rd part, an exception will be thrown saying "Could not switch from STREAMING to STREAMING; current state: STREAMING".
If you try to use the `concatMap` operator, which is supposed to serialize mapper execution (and thus subscription to part content), the problem still appears because of the operator's default prefetch (usually 32): parts are still produced in advance, skipping small ones' content. So if you really want to serialize part content consumption, you have to use `concatMap` with no prefetch.
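To make the prefetch effect concrete, here is a minimal plain-Java model (hypothetical names, not actual Reactor or `PartGenerator` code) in which requesting the next part discards any content of the previous part that has not been drained yet, mirroring the skipping behavior described above:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the reported behavior: a part's pending content
// is silently lost if the next part is requested (prefetch > 0) before
// that content has been drained by the consumer.
class PrefetchModel {
    static List<String> consume(List<String> parts, int prefetch) {
        List<String> received = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        for (String part : parts) {
            // With prefetch > 0 the generator runs ahead of the consumer,
            // so undrained content of the previous part is skipped.
            if (prefetch > 0 && !pending.isEmpty()) {
                pending.clear(); // content silently lost
            }
            pending.add(part);
            if (prefetch == 0) {
                // Serial consumption: drain content before requesting more.
                received.add(pending.poll());
            }
        }
        while (!pending.isEmpty()) {
            received.add(pending.poll());
        }
        return received;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("20KB", "1KB", "250KB");
        System.out.println(consume(parts, 32)); // earlier parts' content skipped
        System.out.println(consume(parts, 0));  // all content delivered in order
    }
}
```

The model is deliberately simplistic, but it captures why only `concatMap` with a prefetch of 0 keeps every part's content intact.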
However, when doing so, you are faced with another problem (which is a bug): no part body content (i.e. the part content's data buffers) is produced at all, even after subscription and an unbounded request.
My analysis of the problem
This is due to the fact that the current implementation (through the `requestToken()` method) does not account for the correct downstream request (given by `sink.requestedFromDownstream()`): only the downstream request from the parts' sink is accounted for, whereas when delivering part content, the content sink should be considered, not the part sink. Thus, two distinct `requestToken()` methods (or a parameterized `requestToken(Sink sink)`) should be defined, to distinguish between part demand and content demand: for example, we could define `requestContent()` and `requestPart()` methods, with conditions pointing to different sinks. And the `requestContent()` method should probably be defined at the `StreamingState` inner class level, in order to point to the correct sink.
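As a rough sketch of this split (plain Java with hypothetical names; the real `PartGenerator` works with Reactor sinks), each kind of emission would take its token from a separate demand counter rather than from a single shared one:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: separate demand tracking for the part sink and
// the current part's content sink, replacing a single requestToken().
class DemandTracker {
    private final AtomicLong partDemand = new AtomicLong();
    private final AtomicLong contentDemand = new AtomicLong();

    void requestParts(long n)   { partDemand.addAndGet(n); }
    void requestContent(long n) { contentDemand.addAndGet(n); }

    // Take a token from the demand that matches what is about to be emitted.
    boolean requestPartToken()    { return take(partDemand); }
    boolean requestContentToken() { return take(contentDemand); }

    private static boolean take(AtomicLong demand) {
        // Single-threaded sketch; real code would need an atomic decrement-if-positive.
        long d = demand.get();
        if (d <= 0) {
            return false; // no downstream demand on this sink: do not emit
        }
        demand.decrementAndGet();
        return true;
    }

    public static void main(String[] args) {
        DemandTracker t = new DemandTracker();
        t.requestParts(1);
        System.out.println(t.requestPartToken());    // a part may be emitted
        System.out.println(t.requestContentToken()); // content sink has no demand yet
    }
}
```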
Another problem is that a new intermediate state should be defined: currently, there is only one state for part chaining and content delivery, the `StreamingState` (which we could rename to `ContentStreamingState`). But a transitional state, which we could call `PartStreamingState`, should be defined to represent the fact that we are waiting for a new part to be requested from downstream, once all of the previous part's content has been exhausted. The former would use `requestContent()` and the latter `requestPart()`.
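The proposed transitions could be sketched as follows (plain Java with hypothetical names; the real states are inner classes, not an enum):

```java
// Hypothetical sketch of the proposed state split: PART_STREAMING waits
// for the next part to be requested from downstream; CONTENT_STREAMING
// delivers the current part's body until it is exhausted.
class StreamingStates {
    enum State { PART_STREAMING, CONTENT_STREAMING, COMPLETE }

    static State onPartRequested(State s) {
        if (s != State.PART_STREAMING) {
            throw new IllegalStateException("Part requested in state " + s);
        }
        return State.CONTENT_STREAMING;
    }

    static State onContentExhausted(State s, boolean lastPart) {
        if (s != State.CONTENT_STREAMING) {
            throw new IllegalStateException("Content exhausted in state " + s);
        }
        return lastPart ? State.COMPLETE : State.PART_STREAMING;
    }

    public static void main(String[] args) {
        State s = State.PART_STREAMING;
        s = onPartRequested(s);           // deliver first part's content
        s = onContentExhausted(s, false); // wait for the next part request
        s = onPartRequested(s);           // deliver last part's content
        s = onContentExhausted(s, true);
        System.out.println(s);            // COMPLETE
    }
}
```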
Also, the part number, and/or a flag stating whether the part has already been passed, should be defined in the `StreamingState` inner class, to be able to produce a correct error message when the serial consumption of parts and part contents is not honored by the downstream subscriber. Whenever a part's content is subscribed to or requested, the flag or part number should be checked to ensure the consumer is subscribing to or requesting the correct part (the current one), and not a previous one: if the consumer requests or subscribes too late to a part that has already been streamed, an explicit error message, specifying the part numbers, should be raised, like "Trying to subscribe to a part (number #) that has already been completed (current part: #)". Also, the `StreamingState` should display the part number in its `toString()` method.
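A minimal sketch of such a guard (plain Java, hypothetical names, not the actual `PartGenerator` internals) could look like this:

```java
// Hypothetical sketch: track the current part index so that a late
// subscription to an already-completed part fails with an explicit message.
class PartGuard {
    private int currentPart = 0;

    void advance() { currentPart++; }

    void checkSubscription(int partNumber) {
        if (partNumber < currentPart) {
            throw new IllegalStateException(
                "Trying to subscribe to a part (number " + partNumber
                + ") that has already been completed (current part: "
                + currentPart + ")");
        }
    }

    @Override
    public String toString() {
        // Expose the part number for easier debugging, as suggested above.
        return "StreamingState[part=" + currentPart + "]";
    }

    public static void main(String[] args) {
        PartGuard g = new PartGuard();
        g.advance(); // part 0 completed, now on part 1
        try {
            g.checkSubscription(0); // too late: part 0 is gone
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(g); // StreamingState[part=1]
    }
}
```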
Finally, the `ContentStreamingState` should also use an inner queue to deliver the content body, only when content is requested. Currently, when a data buffer comprises multiple parts, the parts' content is delivered unconditionally: it should be enqueued instead, and delivered only when requested by the part content subscriber.
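Such demand-driven queuing could be sketched like this (plain Java, hypothetical names, with strings standing in for `DataBuffer` chunks):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: buffer content chunks and deliver them only once
// the content subscriber has signalled demand, instead of pushing every
// chunk of a multi-part buffer unconditionally.
class ContentQueue {
    private final Deque<String> queue = new ArrayDeque<>();
    private final StringBuilder delivered = new StringBuilder();
    private long demand = 0;

    void enqueue(String chunk) { queue.add(chunk); drain(); }
    void request(long n)       { demand += n; drain(); }

    private void drain() {
        // Deliver only as many chunks as the subscriber has requested.
        while (demand > 0 && !queue.isEmpty()) {
            delivered.append(queue.poll());
            demand--;
        }
    }

    String delivered() { return delivered.toString(); }

    public static void main(String[] args) {
        ContentQueue q = new ContentQueue();
        q.enqueue("abc");                  // no demand yet: stays queued
        System.out.println(q.delivered()); // (empty)
        q.request(1);                      // now the chunk is delivered
        System.out.println(q.delivered()); // abc
    }
}
```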
Note. We could also consider defining a new `StreamingPartGenerator` class to isolate the streaming behavior from `PartGenerator`, since this behavior is quite different in nature from the "normal" one (saving parts to disk or memory before serving them).
Issue Analytics
- Created 2 years ago
- Comments: 9 (5 by maintainers)
Here is the Groovy code I used. Note that to consume properly, you should use `concatMap` with a 0 prefetch: StreamingPartGenerator.zip.
Closing this issue, because we have deprecated `DefaultPartHttpMessageReader`'s streaming mode in 6.0 RC1, in favor of `PartEvent` and `PartEventHttpMessageReader`, introduced through #28006. In short, the reason is that streaming mode put a lot of restrictions on its consumers, in terms of prefetch but also in other areas. See #28006 for more details, and also #29293.