Streaming parsing of JSON array in Spring WebClient

I found #21862, which is pretty close to my request but has been closed.

I am currently using Spring WebClient (Spring Boot 2.2.6, Spring Framework 5.2.5) to write a service that sits in front of a number of other upstream services and transforms their responses for public consumption. Some of these services respond with very large JSON payloads that are little more than an array of entities wrapped in a JSON document, usually with no other properties:

{
    "responseRoot": {
        "entities": [
            { "id": "1" },
            { "id": "2" },
            { "id": "n" }
        ]
    }
}

There could be many thousands of entities in this nested array and the entire payload can be tens of MBs. I want to be able to read in these entities through a Flux<T> so that I can transform them individually and write them out to the client without having to deserialize all of them into memory. This doesn’t appear to be something that Spring WebFlux supports out of the box.

I’m currently exploring writing my own BodyExtractor which reuses some of the code in Jackson2Tokenizer to try to support this. My plan is to accept a JsonPointer to the location of the array and then parse asynchronously until I find that array, then to buffer the tokens for each array element to deserialize them.

var flux = client.get()
    .uri(uri)
    .exchange()
    .flatMapMany(r ->
        r.body(new StreamingBodyExtractor(JsonPointer.compile("/responseRoot/entities")))
    );
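
As a rough, dependency-free illustration of the buffering step in this plan, the sketch below scans incoming text chunks, tracks nesting depth, and emits each array element as soon as its closing brace arrives, even when an element is split across chunks. It is not the StreamingBodyExtractor itself: the class name `ArrayElementSplitter` and the `arrayDepth` parameter are hypothetical, a fixed depth stands in for real JsonPointer matching (so it assumes no other objects open at that depth), and only object elements are handled. A real extractor would more likely build on Jackson's non-blocking parser, as Jackson2Tokenizer does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: emits each object found directly inside an array at a fixed depth. */
public class ArrayElementSplitter {
    // Number of open containers once the target array has been entered.
    // For /responseRoot/entities this is 3: root object, responseRoot, entities.
    private final int arrayDepth;
    private final StringBuilder element = new StringBuilder();
    private boolean capturing = false;
    private int depth = 0;
    private boolean inString = false;
    private boolean escaped = false;

    public ArrayElementSplitter(int arrayDepth) {
        this.arrayDepth = arrayDepth;
    }

    /** Feeds one chunk and returns the elements completed by it. State is kept between calls. */
    public List<String> feed(String chunk) {
        List<String> completed = new ArrayList<>();
        for (int i = 0; i < chunk.length(); i++) {
            char c = chunk.charAt(i);
            // An object opening directly inside the target array starts a new element.
            if (!capturing && !inString && c == '{' && depth == arrayDepth) {
                capturing = true;
            }
            if (capturing) {
                element.append(c);
            }
            if (inString) {
                // Braces and brackets inside string values must not affect the depth.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
                // Back at the array's depth: the buffered element is complete.
                if (capturing && depth == arrayDepth) {
                    completed.add(element.toString());
                    element.setLength(0);
                    capturing = false;
                }
            }
        }
        return completed;
    }
}
```

Each string returned by `feed` could then be handed to an ObjectMapper and published downstream one element at a time, so only a single element is buffered at any moment.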

Before I go too far down this path I was curious if this was functionality that Spring would be interested in supporting out of the box.

Similarly, I was curious whether it would be possible to stream a response out of a WebFlux controller via a Flux<T>, with the streamed elements wrapped in a JSON array and possibly in a root JSON document as well.
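
To make the desired output shape concrete, here is a minimal sketch of the wrapping logic in plain Java, with no Spring or Reactor types: it writes a prefix, then each already-serialized element separated by commas as it is pulled from the source, then a suffix. The class `WrappedArrayWriter` and its parameters are hypothetical names for illustration, and the elements are assumed to be valid JSON strings already. In WebFlux the same idea would presumably be expressed over a Flux rather than an Iterator.

```java
import java.io.PrintWriter;
import java.util.Iterator;

/** Hypothetical helper: writes elements one by one inside a fixed JSON wrapper. */
public class WrappedArrayWriter {
    public static void write(Iterator<String> elements, String prefix, String suffix, PrintWriter out) {
        out.print(prefix);                 // e.g. {"responseRoot":{"entities":[
        boolean first = true;
        while (elements.hasNext()) {
            if (!first) {
                out.print(',');            // separator goes before every element except the first
            }
            out.print(elements.next());    // element is written as soon as it is available
            first = false;
        }
        out.print(suffix);                 // e.g. ]}}
    }
}
```

Because each element is written as it is pulled, the full array never has to be held in memory on the producing side.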

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 7
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

HaloFour commented, Oct 5, 2020 (1 reaction)

Thanks for taking a look! Here’s a newer Gist based on the code that we’re currently using in production.

HaloFour commented, Apr 23, 2020 (1 reaction)

Also, not to pile additional requests onto a single issue, but I didn’t see a way to use a BodyExtractor with retrieve(), which forces me to use exchange() and interpret the HTTP error status codes manually. Is there a reason WebClient.ResponseSpec doesn’t include a method that accepts a BodyExtractor?
