
[Critical] All Channel implementations leak every passed object

See original GitHub issue

For the last two days I have been trying to avoid a huge memory leak in kotlinx.coroutines.experimental.channels.Channel. All I found is that an OutOfMemoryError is unavoidable at the moment (please correct me if I’m wrong). Every Channel implementation leaks every object passed through it, which makes them unusable in practice.

Example to quickly reproduce the memory leak (run it with -Xmx256m, for example):

import kotlinx.coroutines.experimental.*
import kotlinx.coroutines.experimental.channels.Channel
import kotlinx.coroutines.experimental.channels.consume
import java.lang.management.ManagementFactory
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicInteger

fun main(args: Array<String>) {
	@Suppress("JoinDeclarationAndAssignment")
	val channel: Channel<ArrayList<Double>>
	channel = Channel(Channel.UNLIMITED)
//	channel = Channel(0)  // rendezvous channel reproduces the OOM as well
//	channel = Channel(16) // so does a small array channel
	launch {
		val I1 = 50_000_000 // number of producer tasks
		val I2 = 5_000      // elements per produced list
		val countDownLatch = CountDownLatch(I1)
		for (i in 0 until I1) {
			async(DefaultDispatcher) {
				// each task allocates a 5_000-element list and sends it
				val numbers = ArrayList<Double>()
				for (k in 1..I2)
					numbers.add(k * 1.0)
				channel.send(numbers)
				countDownLatch.countDown()
			}
		}
		countDownLatch.await()
		channel.close()
	}
	runBlocking {
		val counter = AtomicInteger()
		channel.consume {
			// drain the channel, printing heap usage per received element
			for (element in this) {
				val mxBean = ManagementFactory.getMemoryMXBean()
				val usedMemory = mxBean.heapMemoryUsage.used / 1024f / 1024f
				println("${counter.getAndIncrement()}: $usedMemory MB used")
			}
		}
	}
}

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
qwwdfsad commented, Apr 12, 2018

Please provide the Rx sample; you’re probably using operators with a backpressure mechanism.

I’ve checked: the OOM is reproducible with an unlimited LinkedBlockingQueue as well. Thread.sleep on the producing side won’t help, because you have one thread that creates 50_000_000 objects (coroutines) without any throttling, and those coroutines only call sleep after they have been created. 50_000_000 coroutines are enough to exhaust a 256m JVM.
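
The same failure mode is easy to see without coroutines at all. A minimal sketch (the queue element type and sizes here are illustrative): an unbounded LinkedBlockingQueue with a fast producer and a slow consumer grows without limit until the heap is exhausted.

import java.util.concurrent.LinkedBlockingQueue
import kotlin.concurrent.thread

fun main(args: Array<String>) {
    // Unbounded queue: put() never blocks, so nothing slows the producer down
    val queue = LinkedBlockingQueue<DoubleArray>()

    thread {
        while (true) queue.put(DoubleArray(5_000)) // fast producer
    }

    val consumer = thread {
        while (true) {
            queue.take()
            Thread.sleep(1) // slow consumer; the queue grows until OutOfMemoryError
        }
    }
    consumer.join()
}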

If you use Thread.sleep before creating each coroutine (right before async(DefaultDispatcher)), everything will work (though pretty slowly).
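
For illustration, that crude throttle would look like this (a sketch; the 1 ms delay is arbitrary):

for (i in 0 until I1) {
    Thread.sleep(1) // throttles coroutine creation itself, not the coroutine body
    async(DefaultDispatcher) {
        // ... build the list and send it, as in the original sample ...
    }
}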

What you need is a backpressure mechanism. You can use a limited-size channel for this purpose as well, where the capacity is the number of pending (not currently executing) tasks.

E.g. if you rewrite your sample this way:

// Maximum 32 unprocessed results
val listChannel: Channel<ArrayList<Double>> = Channel(32)
// Maximum 32 created `async` tasks
val semaphoreChannel = Channel<Unit>(32)

launch {
    val I1 = 50_000_000
    val I2 = 5_000
    val countDownLatch = CountDownLatch(I1)
    for (i in 0 until I1) {
        // Acquire permit to launch new task
        semaphoreChannel.send(Unit)
        async(DefaultDispatcher) {
            val numbers = ArrayList<Double>()
            for (k in 1..I2)
                numbers.add(k * 1.0)
            listChannel.send(numbers)
            countDownLatch.countDown()
            semaphoreChannel.receive() // release permit
        }
    }
    countDownLatch.await()
    listChannel.close()
}
runBlocking {
    val counter = AtomicInteger()
    listChannel.consume {
        for (element in this) {
            val mxBean = ManagementFactory.getMemoryMXBean()
            val usedMemory = mxBean.heapMemoryUsage.used / 1024f / 1024f
            println("${counter.getAndIncrement()}: $usedMemory MB used")
        }
    }
}

On my machine, it never exceeds 150 MB while processing ~2 million tasks.

But in general this approach is not intuitive. What would be really helpful is the worker pool pattern (https://github.com/Kotlin/kotlinx.coroutines/issues/172), with which your example could hypothetically be rewritten as something like:

produce<Int> {
    for (i in 0 until I1) {
        send(i)
    }
}.map(parallelism = 8) {
    val numbers = ArrayList<Double>()
    for (k in 1..I2)
        numbers.add(k * 1.0)
    send(numbers)
}.map(parallelism = 1) {
    val mxBean = ManagementFactory.getMemoryMXBean()
    val usedMemory = mxBean.heapMemoryUsage.used / 1024f / 1024f
    println("${counter.getAndIncrement()}: $usedMemory MB used")
}
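
Until such an operator exists, a hand-rolled worker pool over a shared channel gives the same shape with the current primitives. A sketch against the experimental 0.x API used above (the worker count of 8 and all sizes are illustrative):

import kotlinx.coroutines.experimental.*
import kotlinx.coroutines.experimental.channels.Channel
import kotlinx.coroutines.experimental.channels.produce

fun main(args: Array<String>) = runBlocking {
    // Bounded work channel: the producer suspends once 32 items are pending
    val work = produce(capacity = 32) {
        for (i in 0 until 1_000_000) send(i)
    }
    val results = Channel<ArrayList<Double>>(32)

    // Fan-out: 8 workers compete for items from the same channel
    val workers = List(8) {
        launch {
            for (i in work) {
                val numbers = ArrayList<Double>()
                for (k in 1..5_000) numbers.add(k * 1.0)
                results.send(numbers)
            }
        }
    }
    // Close the results channel once every worker is done
    launch {
        workers.forEach { it.join() }
        results.close()
    }

    var counter = 0
    for (element in results) {
        counter++
    }
    println("$counter lists processed")
}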

0 reactions
amal commented, Apr 12, 2018

@qwwdfsad yeah, it seems that better backpressure control can reduce memory usage tremendously.

Also, I left too many simultaneous tasks in this repro example, so the OOM was definitely caused by the number of created coroutines, not by the sent data. My bad.

Sorry for bothering you with a false alarm. What I hadn’t understood is that even with a RendezvousChannel or a limited ArrayChannel, the sender suspends when the channel is full instead of blocking. It seems that what I saw in the heap dump were these suspended senders, and of course they hold references to the sent data 😦
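
For illustration, here’s a minimal sketch of that behavior (same experimental API; the 10 MB array is arbitrary): send suspends on a rendezvous channel, and the suspended sender keeps the element strongly reachable until a receiver takes it.

import kotlinx.coroutines.experimental.*
import kotlinx.coroutines.experimental.channels.Channel

fun main(args: Array<String>) = runBlocking {
    val channel = Channel<ByteArray>() // rendezvous channel: capacity 0
    val sender = launch {
        // send() suspends here; the 10 MB array is still referenced by the
        // suspended continuation, so it shows up in heap dumps
        channel.send(ByteArray(10 * 1024 * 1024))
        println("handed off")
    }
    delay(100)        // the sender is now suspended, holding the array
    channel.receive() // receiving it lets the array become garbage
    sender.join()
}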

With the worker pool pattern, it would be much more usable and intuitive.
