High CPU utilization of Promise.await and app degradation over time
Issue Description
Hello, this issue is the result of investigating a related problem in https://github.com/zio/zio-kafka/issues/238
For some reason, when Promises are continuously recreated, the inner state of each new Promise (created after the old one has already completed) slowly grows and starts to consume too much CPU time, which significantly affects the performance of the entire application.
The actual problem is in the following function and its List of joiners. It starts at 0.1-2% of CPU time and keeps growing until you reset the parent effect or the app stops responding (usually up to 30% or more of CPU):

How it looks in real application
Reproducible scenario (it takes at least 1-3 hours to see the full effect, but the growing time spent on the inner joiners list is visible in the CPU sampler from the beginning):
testM("stream should not go brrr") {
  def newJob(queue: Queue[Promise[Option[Throwable], Chunk[Int]]]) = ZStream {
    ZManaged.succeed {
      for {
        p <- Promise.make[Option[Throwable], Chunk[Int]]
        _ <- queue.offer(p)
        r <- p.await
      } yield r
    }
  }

  (for {
    jobQueue  <- Queue.unbounded[ZStream[Any, Throwable, Int]]
    feedQueue <- Queue.unbounded[Promise[Option[Throwable], Chunk[Int]]]
    // Actual stream of substreams
    stream = ZStream
               .fromQueue(jobQueue)
               .mapMParUnordered(50) { s =>
                 // Just to affect performance faster
                 s.mapMPar(10)(Task(_))
                   .groupedWithin(1000, 1.millis)
                   .buffer(2)
                   .bufferSliding(1)
                   .runDrain
               }
               .runDrain
    // Feed to fulfill promises
    feed = ZStream
             .fromQueue(feedQueue)
             .mapMPar(10) { x =>
               x.succeed(Chunk(1))
             }
             .runDrain
    fb1 <- stream.fork
    fb2 <- feed.fork
    _   <- Chunk.fromIterable(0.to(10)).mapM(_ => jobQueue.offer(newJob(feedQueue)))
    _   <- fb1.zip(fb2).join
  } yield assert(true)(equalTo(true))).provideCustomLayer(zio.clock.Clock.live)
}
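Since the slowdown takes hours to become obvious, it may help to log throughput at a fixed interval next to the CPU sampler. The following is only a minimal sketch, assuming ZIO 1.x; `ThroughputProbe`, `sample`, and the stand-in workload are hypothetical names introduced for illustration and are not part of the reproducer above. In practice the counter update would be placed wherever a promise gets fulfilled.

```scala
import zio._
import zio.duration._

// Hypothetical probe, not part of the original reproducer.
object ThroughputProbe extends zio.App {

  // Read the number of completions since the last sample and reset the counter.
  def sample(completed: Ref[Long]): UIO[Long] =
    completed.getAndSet(0L)

  def run(args: List[String]): URIO[ZEnv, ExitCode] =
    (for {
      completed <- Ref.make(0L)
      // Stand-in workload: create, fulfill and await one promise per iteration.
      work = (for {
               p <- Promise.make[Nothing, Int]
               _ <- p.succeed(1)
               _ <- p.await
               _ <- completed.update(_ + 1)
             } yield ()).forever
      // Print the completions counted in each one-second window.
      report = sample(completed)
                 .flatMap(n => UIO(println(s"completed in last window: $n")))
                 .repeat(Schedule.spaced(1.second))
      w <- work.fork
      r <- report.fork
      _ <- ZIO.sleep(5.seconds)
      _ <- w.interrupt *> r.interrupt
    } yield ()).exitCode
}
```

If the per-window count keeps dropping while the workload stays the same, that would be consistent with the degradation described in this issue, although it does not by itself point at the cause.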
TLDR: We have a stream of substreams that creates and fulfills promises one by one. It becomes very slow over time and uses as much CPU as it can. After we recreate the stream, performance returns to normal and then starts degrading again from the beginning.
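To picture why a list of joiners can become expensive, here is a toy, single-threaded model in plain Scala. It is an assumption-laden sketch and not ZIO's `Promise` source: `ToyPromise`, `pendingJoiners`, and the canceller returned by `await` are hypothetical names. The idea is that every pending `await` registers a callback, and cancelling an `await` removes its callback with a linear scan, so the cost of each cancel grows with the number of registered joiners. Whether this is the actual root cause is not established in this thread.

```scala
// Toy, single-threaded model of a promise with a joiners list.
// NOT ZIO's implementation; all names here are hypothetical.
final class ToyPromise[A] {
  private var joiners: List[A => Unit] = Nil // callbacks of pending awaits
  private var result: Option[A]        = None

  def pendingJoiners: Int = joiners.size

  // Register a callback. The returned function mimics interrupting the await.
  def await(joiner: A => Unit): () => Unit =
    result match {
      case Some(a) =>
        joiner(a)
        () => ()
      case None =>
        joiners = joiner :: joiners
        // Removal scans the whole list: O(n) per cancelled await.
        () => { joiners = joiners.filterNot(_ eq joiner) }
    }

  // Complete the promise once and notify every remaining joiner.
  def succeed(a: A): Unit =
    if (result.isEmpty) {
      result = Some(a)
      val toNotify = joiners.reverse
      joiners = Nil
      toNotify.foreach(_(a))
    }
}

object JoinersDemo {
  def main(args: Array[String]): Unit = {
    val p        = new ToyPromise[Int]
    val received = scala.collection.mutable.ArrayBuffer.empty[Int]
    // Each callback captures `i`, so every joiner is a distinct object.
    val cancels = (1 to 1000).map(i => p.await { v => received += i + v; () })
    println(p.pendingJoiners) // 1000
    cancels.take(999).foreach(cancel => cancel())
    println(p.pendingJoiners) // 1
    p.succeed(42)
    println(received) // ArrayBuffer(1042)
  }
}
```

In this model, registering and then cancelling n awaits costs O(n^2) in total, which is the kind of growth a CPU sampler would show as more and more time spent in the joiners list.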
Issue Analytics
- State:
- Created 2 years ago
- Reactions: 2
- Comments:11 (9 by maintainers)
Oh sorry I thought we were in zio-kafka 😃 @jsfwa proposed a fix for this, but it’s a bit of a workaround and does not solve the actual underlying problem. @jdegoes was looking into this.
Did you mean to link to another issue? 😄 Or recursive-trolling me? 😝