register_output_renderer() should support streaming data
I’d like to implement this by first extending the `register_output_renderer()` hook to support streaming huge responses, then switching CSV to use the plugin hook, in addition to TSV using it.
_Originally posted by @simonw in https://github.com/simonw/datasette/issues/1096#issuecomment-732542285_
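For context, a minimal sketch of the dict-based contract the hook has today, which this issue proposes to extend. The CSV-building body and the simplified `render` signature are assumptions for illustration, not Datasette's actual code:

```python
# Sketch of the current dict-returning register_output_renderer() contract.
# The render callback signature is simplified; the point is that the whole
# response body is built in memory, which is the limitation for huge results.
import csv
import io


def render_csv(rows, columns):
    # Buffer the entire output before returning it - no streaming possible.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)
    for row in rows:
        writer.writerow(row)
    return {"body": buffer.getvalue(), "content_type": "text/csv"}


def register_output_renderer(datasette):
    # The hook currently returns a dictionary describing one renderer.
    return {"extension": "csv", "render": render_csv}
```

Because the body must exist in full before the response is sent, memory use grows with the result size - hence the proposal below to stream instead.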
Issue Analytics
- Created: 3 years ago
- Reactions: 1
- Comments: 12 (12 by maintainers)
Top GitHub Comments
Idea: instead of returning a dictionary, `register_output_renderer` could return an object. The object could have the following properties:

- `.extension` - the extension to use
- `.can_render(...)` - says if it can render this
- `.can_stream(...)` - says if streaming is supported
- `async .stream_rows(rows_iterator, send)` - method that loops through all rows and uses `send` to send them to the response in the correct format

I can then deprecate the existing `dict` return type for 1.0.

Ha! That was your idea (and a good one).
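As a concrete sketch, the object contract proposed above might look like the following. The property names come from the comment; the class name, the CSV formatting, and the driver coroutine are assumptions:

```python
# Hypothetical renderer object implementing the proposed contract:
# .extension, .can_render(...), .can_stream(), and async .stream_rows(...).
import asyncio


class CSVRenderer:
    extension = "csv"

    def can_render(self, columns):
        return True

    def can_stream(self):
        return True

    async def stream_rows(self, rows_iterator, send):
        # Loop through all rows and use send() to push each formatted
        # chunk into the response, instead of buffering everything.
        for row in rows_iterator:
            await send(",".join(str(value) for value in row) + "\r\n")


async def demo():
    # Collect what send() would have written to the response.
    chunks = []

    async def send(chunk):
        chunks.append(chunk)

    await CSVRenderer().stream_rows([(1, "a"), (2, "b")], send)
    return chunks
```

Because `stream_rows` is a coroutine, each chunk can be awaited onto the wire as it is produced, which is what allows huge responses without holding them in memory.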
But it’s probably worth measuring to see what overhead it adds. It did require both passing in the database and making the whole thing `async`.

Just timing the queries themselves: `AsGeoJSON(geometry) as geometry` takes 10.235 ms.

Looking at the network panel: the `fetch` request. I’m not sure how best to time the GeoJSON generation, but it would be interesting to check. Maybe I’ll write a plugin to add query times to response headers.
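One way to surface those timings is the standard `Server-Timing` response header, which browsers display in the network panel. The `timed()` helper below is hypothetical, not a Datasette API; a real plugin would attach this around the request cycle:

```python
# Sketch: time a callable and expose the duration via Server-Timing.
# timed() and add_timing_header() are illustrative helpers, not Datasette APIs.
import time


def timed(fn, *args, **kwargs):
    # Run fn and return (result, elapsed milliseconds).
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def add_timing_header(headers, elapsed_ms):
    # Server-Timing is a standard header that browser dev tools render
    # alongside the request, which suits the debugging described above.
    headers["Server-Timing"] = f"query;dur={elapsed_ms:.3f}"
    return headers
```

This would make per-query times visible without instrumenting the client.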
The other thing to consider with async streaming is that it might be well-suited for a slower response. When I have to get the whole result and send a response in a fixed amount of time, I need the most efficient query possible. If I can hang onto a connection and get things one chunk at a time, maybe it’s ok if there’s some overhead.