Support for Streams in Responses
APIs that download binary data currently must be described with `type: string, format: binary`. This translates to byte arrays (in Java, for example; at least that's what swagger-ui and swagger-codegen do). But this consumes large amounts of memory if the data is very big, and easily causes out-of-memory errors.
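For reference, the status quo looks like this in an OpenAPI 3.0 definition (the path and operation here are illustrative, not from any real API):

```yaml
paths:
  /files/{id}:
    get:
      summary: Download a file
      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: The raw file contents
          content:
            application/octet-stream:
              schema:
                type: string     # current spelling for binary payloads
                format: binary   # codegens typically map this to byte[]
```

Because the schema is just a string, generators have no hint that the body may be too large to buffer in memory.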
Typically huge blocks of data are sent via streams instead of byte arrays in most remote APIs. This allows for loading only a block of data at a time in memory.
The workaround used to be the `file` format, but this was removed in version 3.
What I propose is a new `type: stream` (with `format: binary`) which would allow this. As an example, this could map to Spring's `ResponseEntity(inputStreamResource, ...)` on the server side, and to okhttp's `ResponseBody.byteStream()` on the client side. Lots of HTTP client and server frameworks support streams in their API implementations, so I'm surprised this doesn't already exist.
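As a sketch, the proposed response could be written like this (hypothetical syntax; `type: stream` is not valid OpenAPI today):

```yaml
responses:
  '200':
    description: Large download, handed to the caller as a stream
    content:
      application/octet-stream:
        schema:
          type: stream     # proposed; signals "do not buffer into byte[]"
          format: binary
```

A generator seeing this could emit `InputStream`- or `ResponseBody.byteStream()`-style signatures instead of byte arrays.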
Issue Analytics
- State:
- Created 5 years ago
- Reactions: 25
- Comments: 23 (13 by maintainers)
Top GitHub Comments
Another example of an API that sends a never-ending stream of JSON objects (like the Twitter API) is when you watch Kubernetes resources. More details can be found at: https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes

This seems common enough that it should have a simple OpenAPI way to describe it.
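Today the best one can do is pick a media type and explain the framing in prose. A hedged sketch of a watch-style endpoint (modeled loosely on the Kubernetes API; the path and schema name are illustrative, not copied from its actual spec):

```yaml
paths:
  /api/v1/namespaces/{namespace}/pods:
    get:
      parameters:
        - name: watch
          in: query
          schema:
            type: boolean
      responses:
        '200':
          description: >
            When watch=true, the body is an unbounded sequence of
            newline-delimited WatchEvent JSON documents. The streaming
            framing can only be stated here in prose, not in the schema.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/WatchEvent'
```

The schema describes one event, while the actual body is many of them, which is exactly the mismatch this issue is about.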
I think we have two issues here:

1. If there is one JSON document which just won't ever be finished (i.e. new elements are being added to an array, or new key-value pairs to an object), the `application/json` content type applies and the usual JSON schema works. But it would be nice for a client generator to know that a streaming JSON parser (and an API which provides access before parsing is done) needs to be used.
2. If we have a series of JSON documents, in whatever framing format (line-delimited seems to be common), then describing it as `application/json` is wrong, and OpenAPI doesn't really provide a way to describe that it's a stream of X (where X matches a schema). As an example, Nakadi's API (Nakadi is my company's internal event bus) just describes this behavior in the description field of the response, so in the generated documentation you don't even see the content schema.
For this case, having something like `type: stream`, analogous to `type: array`, seems useful. A client could then parse the stream elements one by one using the schema and pass them to the application as they come.
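Borrowing the `items` keyword from arrays, such a stream type might hypothetically look like this (invented syntax, not valid OpenAPI; the media type and schema name are illustrative):

```yaml
content:
  application/x-ndjson:
    schema:
      type: stream                           # proposed, parallel to type: array
      items:
        $ref: '#/components/schemas/Event'   # schema of each stream element
```

Unlike `type: array`, which implies a finite, fully-buffered list, a generator seeing `type: stream` would know to decode and deliver each `Event` as it arrives.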