
When autoCommit is false, committing offsets does not work the expected way

See original GitHub issue

Environment

  • Node version: 8.16.2
  • Npm version: 6.4.1
  • Kafkajs version: 1.11.0

Bug Reproduction Code

  • Producer script
const { Kafka } = require('kafkajs');
const kafka = new Kafka({
    clientId: 'my-app',
    brokers: ['aws.confluent.cloud:9092'],
    ssl: true,
    sasl: {
      mechanism: 'plain', // scram-sha-256 or scram-sha-512
      username: 'confluent_username',
      password: 'confluent_password'
    },
});
const topic_name = 'demo-topic';
const producer = kafka.producer();
const initialize_producer_connection = async ()=>{

    await producer.connect();

    await producer.send({
        topic: topic_name,
        messages: [
            { key: 'key1', value: 'hello world 1' },
            { key: 'key2', value: 'hello world 2' },
            { key: 'key3', value: 'hello world 3' },
            { key: 'key4', value: 'hello world 4' },
            { key: 'key5', value: 'hello world 5' },
            { key: 'key6', value: 'hello world 6' },
        ],
    });

};
initialize_producer_connection();
  • Consumer script
const { Kafka } = require('kafkajs');
const kafka = new Kafka({
    clientId: 'my-app',
    brokers: ['aws.confluent.cloud:9092'],
    ssl: true,
    sasl: {
      mechanism: 'plain', // scram-sha-256 or scram-sha-512
      username: 'confluent_username',
      password: 'confluent_password'
    },
});
const topic_name = 'demo-topic';
const group_id = 'TESTING_GROUP'; 
const consumer = kafka.consumer({
    groupId: group_id,
});
const initialize_consumer_connection = async () => {

    await consumer.connect();

    await consumer.subscribe({
        topic: topic_name,
        fromBeginning: true
    });

    await consumer.run({
        autoCommit: false,
        eachMessage: async ({ topic, partition, message }) => {

            const prefix = `${topic}[${partition} | ${message.offset}] / ${message.timestamp}`;
            console.log(`- ${prefix} ${message.key}#${message.value}`);

            const commit_message = {
                topic,
                partition,
                offset: message.offset
            };

            await consumer.commitOffsets([commit_message]);

        },
    });

};
initialize_consumer_connection();

Issue Description

  • When the consumer script starts, it receives old messages that it has already consumed in the past.
  • I am trying to commit manually, so autoCommit is set to false in consumer.run.
  • Can I achieve a manual commit this way?
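As the comments below work out, the cause is that Kafka's committed offset denotes the next offset to consume, not the last one processed. A minimal sketch of the fix (the helper name is mine, not part of the kafkajs API; offsets are strings in kafkajs, so BigInt avoids precision loss on very large values):

```javascript
// The committed offset is the offset consumption resumes from, so the
// value to commit after processing a message is message.offset + 1.
// (Illustrative helper, not a kafkajs API.)
function nextOffsetToCommit(offset) {
  return (BigInt(offset) + 1n).toString();
}

// Inside eachMessage, the commit would then look like:
// await consumer.commitOffsets([
//   { topic, partition, offset: nextOffsetToCommit(message.offset) },
// ]);

console.log(nextOffsetToCommit('41')); // '42'
```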

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Reactions: 7
  • Comments: 8

Top GitHub Comments

11 reactions
rhyek commented, Mar 19, 2020

For what it’s worth, I tested setting both offset and offset + 1 with process restarts and this is the correct way to do this:

await consumer.run({
  autoCommit: false,
  eachMessage: async ({ topic, partition, message }) => {
    ...
    await consumer.commitOffsets([{ topic, partition, offset: (Number(message.offset) + 1).toString() }]);
  },
});

With just await consumer.commitOffsets([{ topic, partition, offset: message.offset }]); the consumer will re-process the last message after a process restart.

If there are any drawbacks to this it would be great to hear from the devs.

7 reactions
andycmaj commented, Mar 20, 2020

🤔 My first thought was that it's a bit counter-intuitive if offset means "the last message successfully processed".

If offset means "the message at which consumption will start when a new consumer starts listening", then it makes more sense, but I don't think that's Kafka's intended semantic.

The OffsetCommitRequest consists of a map that denotes the latest processed offset for any given partition (TopicPartition->OffsetAndMetadata).

[image: Sample_Communication diagram]

I believe this is a common confusion in Kafka client libraries; see this python issue and this one.

Seems like a mismatch between commit and consume semantics.

Commit clearly (per the Confluent/Kafka docs) should deal with the last processed message.

I believe CONSUMERS should start processing from the last committed message + 1 when they start up.

However, there are other libraries that seem to confuse the issue:

Commit - Offset committed to permanent storage (broker, file). When consumer restarts this is where it will start consuming from. The committed offset should be last_message_offset+1.

It's unclear to me whether they mean that the consumer will handle this or that the user should manually commit last_message_offset + 1.

Again, personally, based on the Kafka docs ("The OffsetCommitRequest consists of a map that denotes the latest processed offset"), I believe that "fetch next offset after last committed one" should be the responsibility of consumer join internals.
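The semantics described above can be checked with a toy model (a pure simulation, not kafkajs): if a consumer commits last_processed + 1 and resumes at the committed offset, no message is reprocessed across a restart.

```javascript
// Toy model of Kafka commit/resume semantics: the committed offset is
// where consumption resumes, so committing lastProcessed + 1 means a
// restarted consumer sees each message exactly once.
function consume(log, committed, handle) {
  let next = committed; // resume at the committed offset
  while (next < log.length) {
    handle(log[next]);
    next = next + 1; // commit lastProcessed + 1
  }
  return next; // new committed offset
}

const log = ['m0', 'm1', 'm2', 'm3'];
const seen = [];
let committed = 0;
committed = consume(log.slice(0, 2), committed, (m) => seen.push(m)); // first run
committed = consume(log, committed, (m) => seen.push(m)); // after "restart"
console.log(seen); // ['m0', 'm1', 'm2', 'm3'] — each message processed once
```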


Top Results From Across the Web

Kafka Consumer with enable.auto.commit = false still ...
enable.auto.commit=false tells the kafka-clients not to commit offsets, but Spring will commit offsets by default.

Kafka - enable.auto.commit = true/false examples
Kafka consumer will auto commit the offset of the last message received in response to its poll() call. KafkaConsumer#position() method. public ...

Consuming Messages
By default, the consumer will commit the offset seeked. To disable this, set the autoCommit option to false on the consumer. consumer.run({ autoCommit:...

Apache Kafka Offset Management - Learning Journal
The solution to this particular problem is a manual commit. So, we can configure the auto-commit off and manually commit after processing the...

Kafka - When to commit?
If the poll method gets called again despite a failed processing, and auto-commit is still enabled, we may commit offsets while something wrong...
