Flow.chunked with specified time period
Currently Flow supports only the buffer operator with a capacity.
It would be useful to buffer elements within a specified time range.
flow.buffer(Duration.ofSeconds(5)).collect {...}
Issue Analytics
- State:
- Created 4 years ago
- Reactions:62
- Comments:35 (8 by maintainers)
Half a year later… What about such a signature?
example usage:
flow.chunk(10)
- simple size chunking
flow.chunk(512) { delay(5.minutes) }
- accumulate values for 5 minutes with max size of 512
flow.chunk(512, naturalBuffering = true)
- natural buffering variant with max size
flow.chunk(512) { previousChunk -> if (previousChunk.size == 512) Unit else delay(5.seconds) }
- speed up if buffer is getting full before chunk emission
flow.chunk(512) { semaphoreChannel.receive() }
- emit chunk after signal from some external source
Pros:
Cons:
flow.chunk(maxSize = 512, interval = 5.minutes)
Ok, I will have another go at this issue. @elizarov mentioned that there is no obvious use case for minSize and that it complicates things. So let's assume that the minimum chunk size is always one. That seems to fit all use cases:
No minSize param then.
That leaves us with duration and size params:
fun <T> Flow<T>.chunked(interval: Duration, size: Int): Flow<List<T>>
How to express our use cases with it…
Since our 'size' parameter is either the maximum size for time-based chunking or simply the desired size, we should try to emit as soon as it is reached, suspending upstream if need be.
The interval param is a little different. We cannot guarantee that a chunk will be emitted exactly when the interval has passed. Since the interval does not relate to size, I think we can safely assume that it is ok to keep buffering subsequent elements after the interval has passed, even if we cannot emit because downstream is busy.
In other words:
- Reaching the size limit: we suspend upstream until emission happens. A chunk cannot grow bigger than specified.
- Reaching the time limit: we do not suspend upstream, whether we have already emitted or are still waiting for downstream to get ready. The time limit may be breached due to a busy downstream; we cannot prevent that.
That shapes our design into `chunked(intervalConstraint OR sizeConstraint)` consistently across all use cases.
So the proposal boils down to
fun <T> Flow<T>.chunked(interval: Duration, size: Int): Flow<List<T>>
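To make the two rules above concrete (suspend upstream at the size limit, keep buffering past the time limit while downstream is busy), here is a minimal sketch. It is not the implementation proposed in this thread. The name chunkedSketch is hypothetical, and it assumes that the interval timer starts with the first element of each chunk, that Kotlin 1.9+ is used for TimeSource.Monotonic, and that opting in to the experimental select and produce APIs is acceptable.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.FlowPreview
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.channels.produce
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.buffer
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.produceIn
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.selects.onTimeout
import kotlinx.coroutines.selects.select
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds
import kotlin.time.TimeMark
import kotlin.time.TimeSource

// Hypothetical name: this is a sketch, not an operator from kotlinx.coroutines.
@OptIn(ExperimentalCoroutinesApi::class, FlowPreview::class)
fun <T> Flow<T>.chunkedSketch(interval: Duration, size: Int): Flow<List<T>> {
    require(size > 0) { "size must be positive, but was $size" }
    val source = this
    return flow {
        coroutineScope {
            // Rendezvous channel: upstream suspends whenever we stop receiving.
            val elements = source.buffer(Channel.RENDEZVOUS).produceIn(this)
            // Accumulator coroutine: builds chunks and hands them to the emitter below.
            val chunks = produce<List<T>>(capacity = Channel.RENDEZVOUS) {
                var chunk = ArrayList<T>()
                // Assumption: the timer starts at the first element of a chunk.
                var deadline: TimeMark = TimeSource.Monotonic.markNow()
                var upstreamDone = false
                while (!upstreamDone || chunk.isNotEmpty()) {
                    val full = chunk.size >= size
                    // Minimum chunk size is one, so an empty chunk is never due.
                    val due = chunk.isNotEmpty() &&
                        (full || upstreamDone || deadline.hasPassedNow())
                    select<Unit> {
                        if (due) {
                            // Hand over as soon as downstream is ready for it.
                            channel.onSend(chunk) { chunk = ArrayList<T>() }
                        } else if (chunk.isNotEmpty()) {
                            // Wake up when the interval expires; the loop re-checks `due`.
                            onTimeout(-deadline.elapsedNow()) { }
                        }
                        // Time limit reached but downstream busy: keep receiving.
                        // Size limit reached: stop receiving, which suspends upstream.
                        if (!full && !upstreamDone) {
                            elements.onReceiveCatching { result ->
                                if (result.isSuccess) {
                                    if (chunk.isEmpty()) {
                                        deadline = TimeSource.Monotonic.markNow() + interval
                                    }
                                    chunk.add(result.getOrThrow())
                                } else {
                                    result.exceptionOrNull()?.let { throw it }
                                    upstreamDone = true
                                }
                            }
                        }
                    }
                }
            }
            // Emission stays in the collector's own coroutine.
            for (ready in chunks) emit(ready)
        }
    }
}

fun main(): Unit = runBlocking {
    val chunks = flow {
        repeat(5) { emit(it) }      // burst of five elements
        delay(300.milliseconds)     // pause longer than the interval
        emit(5)
    }.chunkedSketch(interval = 100.milliseconds, size = 3).toList()
    // Order is preserved and every chunk holds between 1 and 3 elements.
    println(chunks)
}
```

The accumulator runs in its own coroutine so that a slow collector only blocks the hand-over, not the receiving side. Upstream is throttled only when a chunk is full, which is the asymmetry between the size and time limits described above.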
Proposed impl (give or take - no sanity checks, etc):
Helper functions:
Plus optimized, non-concurrent, impl for purely size-based chunking:
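For the purely size-based case no channels or timers are needed, because a chunk is only ever completed by an incoming element. The following is a minimal sketch of that idea, not the implementation from this thread; chunkedBySize is a hypothetical name chosen for illustration.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical name: a sequential, size-only chunking sketch.
fun <T> Flow<T>.chunkedBySize(size: Int): Flow<List<T>> {
    require(size > 0) { "size must be positive, but was $size" }
    val source = this
    return flow {
        var chunk = ArrayList<T>()
        source.collect { value ->
            chunk.add(value)
            if (chunk.size == size) {
                emit(chunk)
                // Start a fresh list so the emitted chunk is never mutated afterwards.
                chunk = ArrayList<T>()
            }
        }
        // Flush the trailing partial chunk, if any, once upstream completes.
        if (chunk.isNotEmpty()) emit(chunk)
    }
}

fun main(): Unit = runBlocking {
    val chunks = flowOf(1, 2, 3, 4, 5).chunkedBySize(2).toList()
    println(chunks) // [[1, 2], [3, 4], [5]]
}
```

Because collection and emission happen in the same coroutine, a busy downstream naturally suspends upstream, so a chunk can never exceed the requested size.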