[EventHub] How to force the EventProcessorClient to receive messages from every partition
Query/Question
I have an Event Hub with 2 partitions. The problem is that the EventProcessorClient, once started, listens to only one partition. Is there an option to specify the partition for the client, or to tell it to read messages from all partitions? Or is there a different approach that could be used?
Here is the code of a sample program that consumes the data:
using System;
using System.Text;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Consumer;
using Azure.Messaging.EventHubs.Processor;

namespace EventHubReader
{
    class Program
    {
        static async Task Main()
        {
            var eventHubConnectionString = "event-hub-connection-string";
            var storageConnectionString = "storage-connection-string";
            var eventHubName = "eventhubname";
            var storageContainerName = "storagecontainername";

            var client = new EventProcessorClient(
                new BlobContainerClient(storageConnectionString, storageContainerName),
                EventHubConsumerClient.DefaultConsumerGroupName,
                eventHubConnectionString,
                eventHubName);

            client.ProcessEventAsync += OnProcessEventAsync;
            client.ProcessErrorAsync += OnProcessErrorAsync;

            // Log each partition that this processor instance takes ownership of.
            client.PartitionInitializingAsync += args =>
            {
                Console.WriteLine($"Initializing partition '{args.PartitionId}'");
                return Task.CompletedTask;
            };

            await client.StartProcessingAsync();
            Console.ReadKey();
        }

        private static async Task OnProcessEventAsync(ProcessEventArgs arg)
        {
            if (!arg.HasEvent)
            {
                Console.WriteLine($"{nameof(OnProcessEventAsync)} was called without event");
                return;
            }

            // Await the checkpoint so that failures surface and the log line below is accurate.
            await arg.UpdateCheckpointAsync();

            var message = Encoding.Default.GetString(arg.Data.Body.ToArray());
            Console.WriteLine($"[Thread: {Environment.CurrentManagedThreadId}; Partition: {arg.Partition.PartitionId}] [{DateTimeOffset.Now}] Checkpoint updated for '{message}'");
        }

        private static Task OnProcessErrorAsync(ProcessErrorEventArgs arg)
        {
            Console.WriteLine($"[{DateTimeOffset.Now}] Error: {arg.Exception}");
            return Task.CompletedTask;
        }
    }
}
As I’ve mentioned, it starts listening to only one of the two partitions. I added a second EventProcessorClient and, interestingly, sometimes it worked well 😃 I have no idea how they negotiate not to listen to the same partition.
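On the negotiation question: processors that share a consumer group and a blob container coordinate through ownership records kept in that container, and under the default balanced behavior a processor claims partitions gradually rather than all at once. The following is a minimal sketch, not the maintainers' recommended solution, that logs when this instance claims or gives up a partition so the behavior can be observed. The class name `OwnershipLogging` and the two-minute observation window are assumptions chosen for illustration; the connection values are the same placeholders used in the sample above.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Consumer;
using Azure.Messaging.EventHubs.Processor;
using Azure.Storage.Blobs;

namespace EventHubReader
{
    // Hypothetical class name, used only for this illustration.
    class OwnershipLogging
    {
        static async Task Main()
        {
            // Placeholder values; replace with real connection details.
            var eventHubConnectionString = "event-hub-connection-string";
            var storageConnectionString = "storage-connection-string";
            var eventHubName = "eventhubname";
            var storageContainerName = "storagecontainername";

            var client = new EventProcessorClient(
                new BlobContainerClient(storageConnectionString, storageContainerName),
                EventHubConsumerClient.DefaultConsumerGroupName,
                eventHubConnectionString,
                eventHubName);

            // Event and error handlers must be registered before processing starts.
            client.ProcessEventAsync += args => Task.CompletedTask;
            client.ProcessErrorAsync += args =>
            {
                Console.WriteLine($"[{DateTimeOffset.Now}] Error: {args.Exception}");
                return Task.CompletedTask;
            };

            // Raised when this instance takes ownership of a partition.
            client.PartitionInitializingAsync += args =>
            {
                Console.WriteLine($"[{DateTimeOffset.Now}] Claimed partition '{args.PartitionId}'");
                return Task.CompletedTask;
            };

            // Raised when this instance stops processing a partition, for example
            // because another processor took ownership or processing was stopped.
            client.PartitionClosingAsync += args =>
            {
                Console.WriteLine($"[{DateTimeOffset.Now}] Released partition '{args.PartitionId}' ({args.Reason})");
                return Task.CompletedTask;
            };

            await client.StartProcessingAsync();

            // Assumption: two minutes spans several load-balancing cycles,
            // which should be enough to see further partitions being claimed.
            await Task.Delay(TimeSpan.FromMinutes(2));

            await client.StopProcessingAsync();
        }
    }
}
```

If only one partition is logged at first, leaving the program running for a few load-balancing cycles is a reasonable first check before adding a second processor.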
Environment:
- Packages: Azure.Messaging.EventHubs 5.2.0, Azure.Messaging.EventHubs.Processor 5.2.0
- OS: Windows 10 Pro Version 2004
- Framework: .NET Framework 4.7.2
- IDE: Visual Studio Professional 2019 Version 16.7.2
Issue Analytics
- State:
- Created 3 years ago
- Comments: 8 (3 by maintainers)
Top GitHub Comments
I’d be very, very careful when considering timestamps as any type of decision criteria; even a single machine can have issues with clock skew. In this case you’re describing at least two machines, which increases the risk. Adding more processors here would further increase that risk.
In this case, I’m thinking in terms of additional machines to host. The processor runs most of its work in the background using the .NET thread pool and uses a dedicated connection for each partition that it owns. If you’re network-bound, using more processors wouldn’t provide any benefit. If you’re CPU-bound, using more processors on the same machine would be likely to increase contention and cause more context switches. Without profiling, it’s not possible to say for sure, but I suspect that you would see lower throughput in that case.
@jsquire thank you for the very useful information that you’ve provided!