Streaming parsing of JSON array in Spring WebClient
I found #21862, which is pretty close to my request but is closed.
I am currently using Spring WebClient with Spring Boot 2.2.6 and Spring Framework 5.2.5 to write a service that sits in front of a number of other upstream services and transforms their responses for public consumption. Some of these services respond with very large JSON payloads that are little more than an array of entities wrapped in a JSON document, usually with no other properties:
{
  "responseRoot": {
    "entities": [
      { "id": "1" },
      { "id": "2" },
      { "id": "n" }
    ]
  }
}
There could be many thousands of entities in this nested array, and the entire payload can be tens of MBs. I want to be able to read in these entities through a Flux<T> so that I can transform them individually and write them out to the client without having to deserialize all of them into memory at once. This doesn’t appear to be something that Spring WebFlux supports out of the box.
I’m currently exploring writing my own BodyExtractor, which reuses some of the code in Jackson2Tokenizer, to try to support this. My plan is to accept a JsonPointer to the location of the array, parse asynchronously until I find that array, and then buffer the tokens for each array element in order to deserialize them.
var flux = client.get()
    .uri(uri)
    .exchange()
    .flatMapMany(r ->
        r.body(new StreamingBodyExtractor(JsonPointer.compile("/responseRoot/entities")))
    );
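To make the buffering idea concrete, here is a minimal sketch in plain Java with no Spring or Jackson dependency. It is a simplified stand-in, not the actual extractor: the class name ArrayElementSplitter is hypothetical, it locates the array by its property name only instead of a full JsonPointer, it assumes every array element is a JSON object or array, and it emits the raw text of each element rather than a deserialized value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of Spring or Jackson.
final class ArrayElementSplitter {
    private final String arrayName;       // property name of the target array (assumption: no full JsonPointer)
    private final Consumer<String> sink;  // receives the raw JSON text of each completed element
    private final StringBuilder literal = new StringBuilder(); // current string literal
    private final StringBuilder element = new StringBuilder(); // current element being buffered
    private boolean inString;
    private boolean escaped;
    private String lastString;   // most recently completed string literal
    private String pendingKey;   // key whose value is about to start
    private int depth;           // current nesting of { and [
    private int arrayDepth = -1; // depth directly inside the target array, or -1 when outside it

    ArrayElementSplitter(String arrayName, Consumer<String> sink) {
        this.arrayName = arrayName;
        this.sink = sink;
    }

    // Chunks may end anywhere, even in the middle of a string or key.
    void feed(String chunk) {
        for (int i = 0; i < chunk.length(); i++) {
            accept(chunk.charAt(i));
        }
    }

    private boolean insideElement() {
        return arrayDepth >= 0 && depth > arrayDepth;
    }

    private void accept(char c) {
        if (inString) {
            if (insideElement()) element.append(c);
            if (escaped) escaped = false;
            else if (c == '\\') escaped = true;
            else if (c == '"') { inString = false; lastString = literal.toString(); }
            else literal.append(c);
            return; // braces and brackets inside strings are never counted
        }
        switch (c) {
            case '"':
                inString = true;
                literal.setLength(0);
                if (insideElement()) element.append(c);
                break;
            case ':':
                pendingKey = lastString;
                if (insideElement()) element.append(c);
                break;
            case '{':
            case '[':
                depth++;
                if (arrayDepth < 0 && c == '[' && arrayName.equals(pendingKey)) {
                    arrayDepth = depth; // entered the target array
                } else if (insideElement()) {
                    element.append(c);
                }
                pendingKey = null;
                break;
            case '}':
            case ']':
                if (insideElement()) element.append(c);
                depth--;
                if (arrayDepth >= 0 && depth == arrayDepth) {
                    sink.accept(element.toString()); // one element is complete
                    element.setLength(0);
                } else if (arrayDepth >= 0 && depth < arrayDepth) {
                    arrayDepth = -1; // the target array has ended
                }
                pendingKey = null;
                break;
            default:
                if (c == ',') pendingKey = null;
                if (insideElement()) element.append(c);
        }
    }

    public static void main(String[] args) {
        List<String> elements = new ArrayList<>();
        ArrayElementSplitter splitter = new ArrayElementSplitter("entities", elements::add);
        splitter.feed("{\"responseRoot\":{\"enti");
        splitter.feed("ties\":[{\"id\":\"1\"},{\"id");
        splitter.feed("\":\"2\"}]}}");
        elements.forEach(System.out::println);
    }
}
```

Only one element is held in memory at a time, which is the property I am after. A real implementation would work on Jackson tokens rather than characters and would hand each buffered element to an ObjectMapper.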
Before I go too far down this path, I was curious whether this is functionality that Spring would be interested in supporting out of the box.
Similarly, I was curious about the ability to stream out a response from a WebFlux controller via a Flux<T>, where the streamed response would be wrapped in a JSON array and possibly in a root JSON document as well.
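As a rough illustration of the output shape I mean, the sketch below writes a fixed prefix, then the elements one at a time separated by commas, then a fixed suffix. It uses a plain Iterator and Consumer as stand-ins for a Flux<T> and the response body, and the EnvelopeWriter name is hypothetical; it is not an existing Spring API.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of Spring.
final class EnvelopeWriter {
    // Emits prefix, then each element separated by commas, then suffix.
    // Elements are pulled one at a time, so the full list is never buffered here.
    static void write(Consumer<String> out, String prefix, Iterator<String> elements, String suffix) {
        out.accept(prefix);
        boolean first = true;
        while (elements.hasNext()) {
            if (!first) {
                out.accept(",");
            }
            out.accept(elements.next());
            first = false;
        }
        out.accept(suffix);
    }

    public static void main(String[] args) {
        StringBuilder body = new StringBuilder();
        Iterator<String> entities = List.of("{\"id\":\"1\"}", "{\"id\":\"2\"}").iterator();
        write(body::append, "{\"responseRoot\":{\"entities\":[", entities, "]}}");
        System.out.println(body); // {"responseRoot":{"entities":[{"id":"1"},{"id":"2"}]}}
    }
}
```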
Top GitHub Comments
Thanks for taking a look! Here’s a newer Gist based on the code that we’re currently using in production.
Also, not to pile up additional requests in a single issue, but I didn’t see a way to use a BodyExtractor with retrieve(), which would force me to manually interpret the HTTP status error codes. Is there a reason WebClient.ResponseSpec doesn’t include a method that accepts a BodyExtractor?