Cannot start more than 5 consumers sequentially

Given the following code:

var Kafka = require('node-rdkafka');
var _ = require('underscore');
var P = require('bluebird');

P.each(_.range(10), idx => {
  console.log(`Connecting consumer ${idx}`);

  var consumer = new Kafka.KafkaConsumer({
    'metadata.broker.list': 'localhost:9092',
    'group.id': `test-group-${idx}`
  }, {});

  return new P((resolve, reject) => {
    consumer
      .on('ready', () => {
        console.log(`Consumer ${idx} ready`);

        consumer.subscribe(['TEST']);
        consumer.consume();

        resolve(consumer);
      })
      .on('error', err => {
        console.error('Consumer error: ' + err);
      });

    consumer.connect();
  });
});

I get the following output:

Connecting consumer 0
Consumer 0 ready
Connecting consumer 1
Consumer 1 ready
Connecting consumer 2
Consumer 2 ready
Connecting consumer 3
Consumer 3 ready
Connecting consumer 4

(then it hangs).

If the consumers are connected in parallel (by removing the return statement before new P), it works fine:

Connecting consumer 0
Connecting consumer 1
Connecting consumer 2
Connecting consumer 3
Connecting consumer 4
Connecting consumer 5
Connecting consumer 6
Connecting consumer 7
Connecting consumer 8
Connecting consumer 9
Consumer 3 ready
Consumer 2 ready
Consumer 0 ready
Consumer 1 ready
Consumer 4 ready
Consumer 5 ready
Consumer 6 ready
Consumer 7 ready
Consumer 8 ready
Consumer 9 ready

If it’s not a bug, then what am I doing wrong?

Kafka version 1.0.0.

Issue Analytics

Created 6 years ago
Reactions: 1
Comments: 8 (2 by maintainers)

Top GitHub Comments

3 reactions

ankon commented, Mar 21, 2018

I think I’ve run into the same problem. I tried the same change to the variable and got the same result: it didn’t change anything.

From the debugging however I could see that we clearly fetched all the messages, they just took ages to arrive at my consumer. Profiles showed that the time was “somewhere” in “(idle)”/“syscalls”. So the theory above really made sense, and I checked: Turns out the variable name @webmakersteve suggested was almost right 😃

http://docs.libuv.org/en/latest/threadpool.html:

Its default size is 4, but it can be changed at startup time by setting the UV_THREADPOOL_SIZE environment variable to any value (the absolute maximum is 128).

Setting UV_THREADPOOL_SIZE to 8 vastly improved the performance for me.
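
For anyone who wants to apply this from inside the script rather than from the shell, here is a minimal sketch. It assumes the variable is set before anything in the process has used the libuv thread pool, so it should run at the very top of the entry file, before node-rdkafka is required. Setting the variable in the environment that launches node is the more dependable route. The helper name ensureThreadPoolSize and the headroom of 4 are made up for illustration and are not part of node-rdkafka or libuv; the cap of 128 is the maximum from the libuv docs quoted above.

```javascript
// Hypothetical helper (not part of node-rdkafka): reserve one pool thread per
// consumer that will run the consume() loop, plus headroom for other work.
function ensureThreadPoolSize(consumerCount, headroom = 4) {
  const MAX_POOL_SIZE = 128; // maximum quoted from the libuv docs above
  const wanted = Math.min(consumerCount + headroom, MAX_POOL_SIZE);
  // Fall back to the documented default of 4 when the variable is unset.
  const current = parseInt(process.env.UV_THREADPOOL_SIZE, 10) || 4;
  // Only ever raise the value, never lower one that was set on purpose.
  const size = Math.max(current, wanted);
  process.env.UV_THREADPOOL_SIZE = String(size);
  return size;
}

// Call this first, before require('node-rdkafka') or any other module
// that might touch the thread pool.
const size = ensureThreadPoolSize(10);
console.log(`UV_THREADPOOL_SIZE is now ${size}`);
```

The helper only raises the value, so a larger size already set in the environment is left alone.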

3 reactions

webmakersteve commented, Mar 10, 2018

This happens because consumers that use the consume loop, i.e. .consume() with no parameters, need to hold onto a thread in the libuv thread pool. If you want to do this, you need to increase the libuv thread pool size by setting process.env.UV_THREADPOOL to a number greater than 4.
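
To make the explanation concrete, here is a toy model of the sequential case from the report. It is only a sketch of the reasoning, not how libuv or node-rdkafka are implemented: it assumes each connect needs a pool thread briefly, and each consume loop then keeps one thread for good. The function name simulateSequentialConnect is made up for illustration.

```javascript
// Toy model only: no libuv, no node-rdkafka. It counts how many threads are
// permanently held by consume loops and stops when connect cannot get one.
function simulateSequentialConnect(poolSize, consumerCount) {
  let heldByConsumeLoops = 0;
  const ready = [];
  for (let idx = 0; idx < consumerCount; idx++) {
    // connect() needs a free thread; if every thread is held, it waits
    // forever, which is the hang seen in the report.
    if (heldByConsumeLoops >= poolSize) break;
    // connect() finishes and returns its thread, then consume() with no
    // parameters takes a thread and never gives it back.
    heldByConsumeLoops += 1;
    ready.push(idx);
  }
  return ready;
}

console.log('pool of 4: ', simulateSequentialConnect(4, 10));
console.log('pool of 16:', simulateSequentialConnect(16, 10));
```

With the default pool of 4, the model lets consumers 0 to 3 become ready and stalls on consumer 4, which matches the output in the report. With a pool larger than the number of consumers, all of them get through.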