
High memory consumption of the unmanaged memory of librdkafka

See original GitHub issue

Description

Hello, we are facing an issue with this library that may be a blocker for us.

My application starts 12 consumers, each consuming messages from a different topic. These topics are filled with protobuf JSON messages.

The 12 consumers consume at maximum speed, and for the sake of this test the consume loop does nothing except log a message so I know the message was consumed.

The consumer code is as follows:

// the method starting the consuming thread
public void Start()
{
    _readerWriterLockSlim.EnterWriteLock();

    try
    {
        if (IsStarted)
        {
            _logger.SmartLogDebug("Kafka Consumer {0} @ {1} > Already started @ {2}", ConsumerId, Topic, BrokerList);
            return;
        }

        _logger.SmartLogDebug("Kafka Consumer {0} @ {1} > Starting consumer @ {2}", ConsumerId, Topic, BrokerList);

        _kafkaConsumer = CreateConsumer();
        _adminClient.WaitTopic(Topic, _logger);
        _kafkaConsumer.Subscribe(Topic);

        _cancellationTokenSource = new CancellationTokenSource();
        Task.Factory.StartNew(Consume, TaskCreationOptions.LongRunning);
        IsStarted = true;
    }
    finally
    {
        _readerWriterLockSlim.ExitWriteLock();
    }
}

// the method creating the consumer
private IConsumer<string, TEvent> CreateConsumer()
{
    var kafkaConsumer = new ConsumerBuilder<string, TEvent>(_consumerConfig)
        .SetErrorHandler(ErrorHandler(_consumerConfig.GroupId))
        .SetValueDeserializer(_deserializer)
        .SetLogHandler(LogHandler)
        .SetKeyDeserializer(new StringDeserializer(Encoding.UTF8))
        .SetPartitionsAssignedHandler(PartitionsAssignedHandler(_consumerConfig.GroupId))
        .SetPartitionsRevokedHandler(PartitionsRevokedHandler(_consumerConfig.GroupId))
        .Build();

    return kafkaConsumer;
}

// the method consuming the messages
public void Consume()
{
    try
    {
        while (!_cancellationTokenSource.IsCancellationRequested)
        {
            var consumeResult = _kafkaConsumer.Consume(_cancellationTokenSource.Token);
            _logger.LogInformation("Event consumed!");
        }
    }
    catch (OperationCanceledException)
    {
        // Consume(token) throws when the token is cancelled; expected on shutdown
    }
}

So we have 12 consumers like the one above, consuming events as fast as they can, and the memory allocated to the .NET process exceeds 1 GB, most of it unmanaged memory used by librdkafka. [screenshot: memory profiler]

This memory appears to be held by the SafeKafkaHandle class of this project. [screenshot]

[screenshot]

How to reproduce

Create an app that starts more than 10 consumers at the same time, consume events at max speed, and check the memory.

So my question is: is this memory consumption expected? If so, is there anything we can do to mitigate it?

Checklist

Please provide the following information:

  • A complete (i.e. we can run it), minimal program demonstrating the problem. No need to supply a project file.
  • Confluent.Kafka nuget version: tried 1.4.4, 1.5.3, 1.6.3, and 1.7.0; always the same result.
  • Apache Kafka version: 2.2.0
  • Client configuration.
  • Operating system: Windows 10 64-bit (but Kafka is running inside a Docker container).
  • Provide logs (with “debug” : “…” as necessary in configuration).
  • Provide broker log excerpts.
  • Critical issue.

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:5 (3 by maintainers)

Top GitHub Comments

2 reactions · edenhill commented, Jul 9, 2021

Since >=v1.5.0 a consumer will pre-fetch and buffer up to roughly 64MB of messages (plus some overhead), so that could account for slightly more than half of the memory usage you’re seeing. You could try reducing this by setting QueuedMaxMessageKbytes to something like 10MB and see if there’s a difference in memory usage.
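In Confluent.Kafka this corresponds to the `QueuedMaxMessageKbytes` property on `ConsumerConfig` (librdkafka's `queued.max.messages.kbytes`, default 65536, i.e. roughly 64 MB per consumer). A minimal sketch of a reduced-prefetch configuration; the broker address and group id are placeholders:

```csharp
using Confluent.Kafka;

var consumerConfig = new ConsumerConfig
{
    BootstrapServers = "localhost:9092",   // placeholder
    GroupId = "my-consumer-group",         // placeholder
    // Cap the per-consumer pre-fetch buffer at ~10 MB instead of the
    // ~64 MB default. With 12 consumers this bounds the unmanaged
    // prefetch memory at roughly 120 MB instead of ~768 MB.
    QueuedMaxMessageKbytes = 10240,
};
```

Note that this caps the total volume buffered per consumer, so if messages arrive faster than they are processed, throughput may drop as the fetcher backs off until the queue drains.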

1 reaction · edenhill commented, Jul 9, 2021

Generally I’d advise using fewer consumers, if possible. E.g., have fewer consumers consume the same set of topics and partitions.
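One way to follow this advice, assuming the topics share a compatible message type, is to subscribe a single consumer instance to several topics rather than creating one instance per topic; each instance carries its own librdkafka handle and prefetch buffer. A sketch with placeholder broker, group, and topic names:

```csharp
using System;
using System.Threading;
using Confluent.Kafka;

var consumerConfig = new ConsumerConfig
{
    BootstrapServers = "localhost:9092",  // placeholder
    GroupId = "my-consumer-group",        // placeholder
};
using var cts = new CancellationTokenSource();

// One consumer instance (one librdkafka handle and one prefetch
// buffer) covering several topics, instead of one instance per topic.
using var consumer = new ConsumerBuilder<string, string>(consumerConfig).Build();
consumer.Subscribe(new[] { "topic-a", "topic-b", "topic-c" });  // placeholder names

try
{
    while (!cts.IsCancellationRequested)
    {
        var result = consumer.Consume(cts.Token);
        // result.Topic identifies which topic the message came from
        Console.WriteLine($"Consumed from {result.Topic}");
    }
}
catch (OperationCanceledException)
{
    // expected on shutdown
}
```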


Top Results From Across the Web

  • Memory usage of librdkafka · Issue #3343: “No, an idle librdkafka instance should not consume much memory, guessing less than a meg. Is this a producer instance?”
  • c# - Kafka consumer local batch queue memory leak: “My service works with a topic through 10 consumers. Unmanaged memory grows immediately after adding a new batch of messages to a cluster...”
  • Dawn of the Dead Ends: Fixing a Memory Leak in Apache ...: “The issue we're debugging is that somewhere, Kafka is allocating memory and never freeing it. Remember, this isn't on-heap memory or off-heap ...”
  • Understanding memory consumption - Knowledge Base: “Why is Neo4j consuming more memory than you allocated? Is this a memory leak or normal behaviour? So many questions!”
  • Tackling unmanaged memory with Dask: “Shed light on the common error message ‘Memory use is high but worker has no data to store to disk. Perhaps some other...’”
