
combine(Iterable<Flow>) is very slow

The combine(…) function becomes very slow as the number of Flows grows.

Flat-mapping a List<List<…>> of 40k lists, each with one element, takes approx. 11 ms. Flat-mapping a List<Flow<…>> of 40k Flows, each with one element, takes approx. 1 minute. And that’s with a very simple non-suspending test case.
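For reference, the list-based baseline could look roughly like the sketch below. The original measurement code for the 11 ms figure is not shown in the issue, so the helper name flattenBaseline and the use of System.nanoTime() are assumptions made for illustration only.

```kotlin
// Hypothetical baseline: flatten 40k single-element lists (not the author's actual benchmark).
fun flattenBaseline(size: Int): List<Int> {
    // Build `size` lists, each holding exactly one element.
    val listOfLists = (1..size).map { listOf(it) }
    // flatten() concatenates all inner lists into a single list.
    return listOfLists.flatten()
}

fun main() {
    val start = System.nanoTime()
    val flat = flattenBaseline(40_000)
    val elapsedMs = (System.nanoTime() - start) / 1_000_000.0
    println("Flattened ${flat.size} elements in $elapsedMs ms")
}
```

Timings from a single unwarmed run like this are only a rough indication, not a proper benchmark.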

If combine is expected to have such low performance, it should be documented; otherwise developers like me treat it as a simple Flow version of List<List<…>>.flatten(). If it’s not expected, then there is a bug.

My use case

I have ~40k objects in memory. For each object there is a Flow and a coroutine that periodically refreshes the object from the server when it’s expired and emits that. At some point I have to continuously merge the latest version of all 40k objects into a single list for further processing. For that I use combine(listOfObjectFlows) { it.toList() }.
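A scaled-down sketch of this setup might look like the following. The Item class, refreshingFlow, latestItems, and the refresh interval are all hypothetical names and values invented for illustration; the real code fetches from a server and is not shown in the issue.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for an object that is refreshed from the server.
data class Item(val id: Int, val version: Int)

// Hypothetical per-object flow: emits a fresh copy, then waits until it "expires".
fun refreshingFlow(id: Int, refreshMillis: Long): Flow<Item> = flow {
    var version = 0
    while (true) {
        emit(Item(id, version++)) // a real implementation would fetch from the server here
        delay(refreshMillis)
    }
}

// Merge the latest version of every object into a single list, as described above.
fun latestItems(ids: List<Int>, refreshMillis: Long): Flow<List<Item>> =
    combine(ids.map { refreshingFlow(it, refreshMillis) }) { it.toList() }

fun main(): Unit = runBlocking {
    // first() takes one snapshot and cancels the otherwise endless flows.
    val snapshot = latestItems((1..5).toList(), refreshMillis = 100).first()
    println(snapshot.map { it.id })
}
```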

Unfortunately that takes somewhere north of 15 minutes already. I have no time to let it finish to see the total…

I’ve written my own implementation now as a workaround.
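The workaround itself is not included in the issue. One possible shape for such a replacement is sketched below, under the assumption that every flow feeds a shared channel and the latest values are kept in an array; the name combineLatestToList is hypothetical, and this is neither the author's implementation nor the library's eventual fix.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical combine replacement: emits a list of the latest values once every flow has emitted.
@Suppress("UNCHECKED_CAST")
fun <T> combineLatestToList(flows: List<Flow<T>>): Flow<List<T>> = flow {
    if (flows.isEmpty()) return@flow
    coroutineScope {
        // Unbounded buffer so collectors never suspend on send (memory use is an accepted trade-off here).
        val updates = Channel<IndexedValue<T>>(Channel.UNLIMITED)
        val collectors = flows.mapIndexed { index, source ->
            launch { source.collect { updates.send(IndexedValue(index, it)) } }
        }
        // Close the channel once every source flow has completed, which ends the loop below.
        launch { collectors.joinAll(); updates.close() }

        val latest = arrayOfNulls<Any>(flows.size)
        val seen = BooleanArray(flows.size)
        var missing = flows.size
        for ((index, value) in updates) {
            latest[index] = value
            if (!seen[index]) { seen[index] = true; missing-- }
            // Emit a snapshot only after every flow has produced at least one value.
            if (missing == 0) emit(latest.toList() as List<T>)
        }
    }
}

fun main(): Unit = runBlocking {
    val start = System.nanoTime()
    val result = combineLatestToList((1..40_000).map { flowOf(it) }).first()
    println("Combined ${result.size} values in ${(System.nanoTime() - start) / 1_000_000} ms")
}
```

Each update costs constant bookkeeping plus one list copy per emitted snapshot, so the initial fill of N flows does not repeat work for every flow. Unlike combine, this sketch does not conflate updates or apply a transform.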

Test code

import kotlin.time.*
import kotlinx.coroutines.flow.*

@OptIn(ExperimentalTime::class)
suspend fun main() {
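    // 40k flows, each emitting a single element.
    val flowList = (1..40_000).map { flowOf(it) }
    val start = TimeSource.Monotonic.markNow()
    // Combine the latest value of every flow into one list.
    val listFlow = combine(flowList) { it.toList() }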

    listFlow.collect {
        println("Took: ${start.elapsedNow()}") // Took: 66.5s
    }
}

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

6 reactions
qwwdfsad commented, Oct 20, 2020

Even with a 40% gain the timing would still be a different ballpark than what I expect (minutes instead of seconds).

The problem was pretty simple: an accidental and well-hidden O(N^2) asymptotic complexity, where N is the number of flows. I’ve fixed it; the fix will be available in 1.4.0 and will be even faster (by a significant margin) than the proposed implementation. In general, for a large number of flows, combine became faster by orders of magnitude, and for two flows (the “basic case”) it became precisely two times faster.

Thanks for pointing it out!

2 reactions
qwwdfsad commented, Oct 12, 2020

Yes, with suspend fun main there is no dispatcher, so the multithreaded Dispatchers.Default is used, which adds non-determinism. I was referring to the deterministic case, where combine is called from within runBlocking or on Dispatchers.Main.
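To measure the deterministic case described here, the test from the issue could be wrapped in runBlocking, roughly as in the sketch below. The helper combineAll is a hypothetical name introduced for this example, and System.nanoTime() replaces the TimeSource API only to keep the sketch free of opt-in annotations.

```kotlin
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Hypothetical helper: combine `count` single-element flows into one list flow.
fun combineAll(count: Int): Flow<List<Int>> =
    combine((1..count).map { flowOf(it) }) { it.toList() }

// runBlocking runs the collection on the calling thread's event loop
// instead of the multithreaded Dispatchers.Default.
fun main(): Unit = runBlocking {
    val start = System.nanoTime()
    combineAll(40_000).collect {
        println("Took: ${(System.nanoTime() - start) / 1_000_000} ms for ${it.size} values")
    }
}
```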
