Coroutine commands that result in a `Flow` can hang
Bug Report
We observed that when executing an mget() with a large set of keys against an empty Redis DB, the call would periodically stall. Further, we observed that this only occurred when the size of the requested set of keys was a multiple of Channel.CHANNEL_DEFAULT_CAPACITY (64). This occurs with some degree of randomness, but the attached test can usually trigger it within 15 to 50 calls to mget.
Current Behavior:
The following was collected via kotlinx-coroutines-debug once a stalled request occurred:
Coroutine DeferredCoroutine{Active}@7dae0690, state: SUSPENDED
at kotlinx.coroutines.reactive.PublisherAsFlow.collectImpl(ReactiveFlow.kt:97)
at kotlinx.coroutines.flow.FlowKt__CollectionKt.toCollection(Collection.kt:32)
....application stack
These coroutines remain permanently SUSPENDED and never resume.
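For anyone trying to capture a similar dump, the following is a minimal sketch of one way to do it, assuming kotlinx-coroutines-debug is on the classpath. The helper name `awaitOrDump` and the timeout value are hypothetical and not part of the original report. It runs the work in a separate `async` coroutine so that the work is still active (and visible in the dump) when the wait times out.

```kotlin
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.async
import kotlinx.coroutines.debug.DebugProbes
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withTimeoutOrNull

// Hypothetical helper: waits up to timeoutMillis for block to finish.
// On timeout it prints a dump of all tracked coroutines, cancels the work,
// and returns null instead of blocking forever.
@OptIn(ExperimentalCoroutinesApi::class)
fun <T : Any> awaitOrDump(timeoutMillis: Long, block: suspend () -> T): T? {
    // Probes must be installed before the coroutines of interest are created.
    DebugProbes.install()
    try {
        return runBlocking {
            // The work is a child of runBlocking, not of the timeout scope,
            // so it is still active when the timeout fires.
            val work = async { block() }
            val result = withTimeoutOrNull(timeoutMillis) { work.await() }
            if (result == null) {
                DebugProbes.dumpCoroutines() // writes to System.out by default
                work.cancel()
            }
            result
        }
    } finally {
        DebugProbes.uninstall()
    }
}
```

Wrapping a single iteration of the reproduction, for example `awaitOrDump(5_000) { connection.mget(*array).toList() }`, would be one way to get a dump like the one above without attaching a debugger.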
Input Code
Code Sample
import io.lettuce.core.ExperimentalLettuceCoroutinesApi
import io.lettuce.core.RedisClient
import io.lettuce.core.RedisURI
import io.lettuce.core.api.coroutines
import kotlinx.coroutines.flow.mapNotNull
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger
@OptIn(ExperimentalLettuceCoroutinesApi::class)
fun main() {
    val i = AtomicInteger()
    val client = RedisClient.create()
    val connection = RedisURI.builder().apply {
        withHost("localhost")
        withPort(6379)
    }.build().let {
        client.connect(it)
    }.coroutines()
    // The usage of 128 here is meaningful, as mentioned above this is only
    // observed with multiples of 64 in the size of the requested set of keys.
    val array = Array(128) { "111" }
    runBlocking {
        repeat(1000) {
            connection.mget(
                *array
            ).mapNotNull { result ->
                if (result.hasValue()) {
                    result.value as String
                } else null
            }.toList()
            println(i.getAndIncrement())
        }
    }
    Thread.sleep(1000)
    println("finish")
}
Observed Behavior
The run will generally stall before 50 iterations are complete. A temporary fix appears to be applying an UNLIMITED buffer:
connection.mget(*array).buffer(Channel.UNLIMITED).mapNotNull { result ->
The default behavior is Channel.BUFFERED, which triggers the CHANNEL_DEFAULT_CAPACITY buffering in Kotlin’s PublisherAsFlow Reactive adapter.
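As a self-contained illustration of the workaround, here is a minimal sketch that wraps the `buffer(Channel.UNLIMITED)` call in a reusable extension. The names `toListUnbounded` and `demo` are hypothetical, and the plain `asFlow()` source merely stands in for the Lettuce `Flow`. It shows the shape of the call, not a reproduction of the hang.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.asFlow
import kotlinx.coroutines.flow.buffer
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical helper: collects a flow into a list through an unbounded
// buffer instead of the default Channel.BUFFERED capacity.
suspend fun <T> Flow<T>.toListUnbounded(): List<T> =
    buffer(Channel.UNLIMITED).toList()

// Stand-in usage with 128 elements, mirroring the key count in the report.
fun demo(): Int = runBlocking {
    (1..128).asFlow().toListUnbounded().size
}
```

With the Lettuce connection from the sample, the call would read `connection.mget(*array).toListUnbounded()`. An unbounded buffer trades memory for progress, so it is presumably only appropriate for result sets of known, modest size.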
Expected Behavior
mget() should not suspend permanently; it should either return results or fail with an error when the command completes.
Environment
- Lettuce: Originally observed on io.lettuce:lettuce-core:6.0.2.RELEASE, we’ve also reproduced on 6.1.4.RELEASE.
- Redis: AWS Elasticache Redis 6.0.5, the above demo script was run against a local Redis 4 via Docker.
Issue Analytics
- Created 2 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
Thanks. That’s what I meant. I’m not sure how/whether Kotlin Coroutines optimize if results aren’t consumed. However, fully consuming a result and running the same code in cycles should not cause a hanging application. I need to check what’s happening.
I had a look and it seems to be a bug in RedisPublisher, where a particular code path wasn’t safe when changing from no demand to demand.
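To make the "no demand to demand" transition concrete, below is a generic sketch of how a publisher can track Reactive Streams demand atomically, so that exactly one caller observes the change from zero and starts draining. This is an illustration only: `DemandTracker` is a hypothetical class, not Lettuce's RedisPublisher code, and it does not claim to mirror the actual fix.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical demand tracker; NOT taken from Lettuce's RedisPublisher.
class DemandTracker {
    private val demand = AtomicLong(0)

    // Adds n to the outstanding demand, saturating at Long.MAX_VALUE.
    // Returns true only for the caller that moved demand from zero to
    // non-zero, which is the caller that should start draining.
    fun request(n: Long): Boolean {
        require(n > 0) { "demand must be positive" }
        while (true) {
            val current = demand.get()
            val next = if (Long.MAX_VALUE - current < n) Long.MAX_VALUE else current + n
            if (demand.compareAndSet(current, next)) return current == 0L
        }
    }

    // Consumes one unit of demand if any is available.
    // Long.MAX_VALUE is treated as unbounded and is never decremented.
    fun tryConsume(): Boolean {
        while (true) {
            val current = demand.get()
            if (current == 0L) return false
            if (current == Long.MAX_VALUE) return true
            if (demand.compareAndSet(current, current - 1)) return true
        }
    }
}
```

If the zero check and the update were performed as separate, non-atomic steps, a request arriving exactly when a batch is exhausted could be recorded without anyone resuming emission. That would be consistent with a stall that appears only when the result size is a multiple of the batch size, though whether it matches the actual defect is an assumption.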