Kafka: allow message key and partition to be chosen independently
If a PartitionKeyStrategy is used with a topic, its value is used as the message key, and is then implicitly used to select the partition according to the default behavior of the Kafka client:
If a valid partition number is specified that partition will be used when sending the record. If no partition is specified but a key is present a partition will be chosen using a hash of the key. If neither key nor partition is present a partition will be assigned in a round-robin fashion.
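The rule quoted above can be modelled in a few lines. This is only an illustrative sketch, not the Kafka client's code: the class name `PartitionChoice` is hypothetical, and `String.hashCode` stands in for the client's real hash (which uses murmur2 over the serialized key bytes).

```java
/** Illustrative model of the quoted selection rule; not the Kafka client's implementation. */
final class PartitionChoice {
    private final int numPartitions;
    private int next = 0; // round-robin cursor; this sketch is not thread-safe

    PartitionChoice(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    /** Both arguments may be null, mirroring the optional partition and key of a record. */
    int choose(Integer partition, String key) {
        if (partition != null) {
            // 1. An explicit, valid partition always wins.
            if (partition < 0 || partition >= numPartitions) {
                throw new IllegalArgumentException("invalid partition " + partition);
            }
            return partition;
        }
        if (key != null) {
            // 2. Otherwise the key is hashed (stand-in hash, see the note above).
            return Math.floorMod(key.hashCode(), numPartitions);
        }
        // 3. With neither key nor partition, rotate through the partitions.
        int chosen = next;
        next = (next + 1) % numPartitions;
        return chosen;
    }
}
```

In this model the key only influences the partition when no partition is given, which is why a strategy that supplies just a key ends up deciding both.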
It might be desirable in some cases to control these independently. For example, you might wish to have a message key that is more fine-grained than the partition key, for use with Kafka log compaction on sub-graphs of the entity state.
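As a rough illustration of that use case, the sketch below derives the partition from an entity id while using a finer-grained message key per sub-graph. All names (`KeyAndPartition`, `Routing`, `route`, the `"/"` separator) are hypothetical, and `String.hashCode` is again only a stand-in hash.

```java
/** Hypothetical helper showing a message key that is finer-grained than the partition key. */
final class KeyAndPartition {
    /** The two independent routing decisions for one message. */
    static final class Routing {
        final String messageKey;
        final int partition;

        Routing(String messageKey, int partition) {
            this.messageKey = messageKey;
            this.partition = partition;
        }
    }

    static Routing route(String entityId, String subGraph, int numPartitions) {
        // Partition by entity, so all messages for one entity stay ordered together.
        int partition = Math.floorMod(entityId.hashCode(), numPartitions);
        // Key by entity plus sub-graph, so log compaction keeps the latest record per sub-graph.
        String messageKey = entityId + "/" + subGraph;
        return new Routing(messageKey, partition);
    }
}
```

Two sub-graphs of the same entity then share a partition but compact separately, because their keys differ.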
Issue Analytics
- State:
- Created 6 years ago
- Comments:11 (6 by maintainers)
Top GitHub Comments
@ignasi35 Yes, my step 1 was indeed derived from @jroper's suggestion. But in my step 2, I actually meant using the new property in the Lagom kafka-broker internals. Anyway, I think I am on the right track.
I have looked at the details and the implementation seems quite straightforward. But then, I was a bit puzzled by the interdependence between a message key and a partition number. I am not a Kafka expert at all, but from what I can see in their Producer API, we can have the following cases:
but we can’t have a partition number without a key.
So, from that, it doesn’t seem good to have two distinct properties like PartitionKeyStrategy and PartitionNumberStrategy, because then if the user defines the second without the first we have a problem. A solution I see would be to have a more general strategy encompassing both key and partition generation, e.g.:

Hence, it would be possible to define the message key alone, or the key and the partition.
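The commenter's own example did not survive on this page. Purely as a hypothetical sketch of what such a combined strategy could look like (the names `MessageRoutingStrategy`, `keyOnly` and `keyAndPartition` are assumptions, not Lagom API), a single type can always yield a key and only optionally a partition:

```java
import java.util.OptionalInt;
import java.util.function.Function;
import java.util.function.ToIntFunction;

/** Hypothetical combined strategy: a key is always produced, a partition only optionally. */
interface MessageRoutingStrategy<M> {
    String key(M message);

    /** Empty by default, which leaves partition selection to the client's key hashing. */
    default OptionalInt partition(M message) {
        return OptionalInt.empty();
    }

    /** Define the message key alone. */
    static <M> MessageRoutingStrategy<M> keyOnly(Function<M, String> keyFn) {
        return keyFn::apply;
    }

    /** Define both the message key and an explicit partition. */
    static <M> MessageRoutingStrategy<M> keyAndPartition(
            Function<M, String> keyFn, ToIntFunction<M> partitionFn) {
        return new MessageRoutingStrategy<M>() {
            @Override
            public String key(M message) {
                return keyFn.apply(message);
            }

            @Override
            public OptionalInt partition(M message) {
                return OptionalInt.of(partitionFn.applyAsInt(message));
            }
        };
    }
}
```

With this shape there is no way to construct a strategy that sets a partition without also supplying a key, which is the combination the comment wants to rule out.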
Of course, this would mean more changes because this new type would replace the previous PartitionKeyStrategy (although they could also live together for a while, by giving precedence to the first).

Does such an approach seem suitable? Do you have another idea to cope with this link between message key and partition?
The approach we’ve taken to adding properties to a topic means that this should be straightforward to add without impacting existing APIs and functionality.