[EventHub] How to force the EventProcessorClient to receive messages from every partition

Query/Question

I have an Event Hub with 2 partitions. The problem is that the EventProcessorClient, once started, listens to only one partition. Is there an option to assign specific partitions to the client, or to tell it to read messages from all partitions? Or is there a different approach that could be used?

Here is the code of a sample program that consumes the data:

using System;
using System.Text;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Consumer;
using Azure.Messaging.EventHubs.Processor;

namespace EventHubReader
{
    class Program
    {
        static async Task Main()
        {
            var eventHubConnectionString = "event-hub-connection-string";
            var storageConnectionString = "storage-connection-string";

            var eventHubName = "eventhubname";
            var storageContainerName = "storagecontainername";

            var client = new EventProcessorClient(
                new BlobContainerClient(storageConnectionString, storageContainerName),
                EventHubConsumerClient.DefaultConsumerGroupName,
                eventHubConnectionString,
                eventHubName);

            client.ProcessEventAsync += OnProcessEventAsync;
            client.ProcessErrorAsync += OnProcessErrorAsync;
            client.PartitionInitializingAsync += args =>
            {
                Console.WriteLine($"Initializing partition '{args.PartitionId}'");
                return Task.CompletedTask;
            };

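            await client.StartProcessingAsync();
            Console.ReadKey();

            // Stop gracefully so that owned partitions are released for other processors.
            await client.StopProcessingAsync();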
        }

        private static async Task OnProcessEventAsync(ProcessEventArgs arg)
        {
            if (!arg.HasEvent)
            {
                Console.WriteLine($"{nameof(OnProcessEventAsync)} was called without event");
                return;
            }

            // Await the checkpoint so that failures surface here and the log line below is accurate.
            await arg.UpdateCheckpointAsync();
            var message = Encoding.Default.GetString(arg.Data.Body.ToArray());
            Console.WriteLine($"[Thread: {Environment.CurrentManagedThreadId}; Partition: {arg.Partition.PartitionId}] [{DateTimeOffset.Now}] Checkpoint updated for '{message}'");
        }

        private static Task OnProcessErrorAsync(ProcessErrorEventArgs arg)
        {
            Console.WriteLine($"[{DateTimeOffset.Now}] Error: {arg.Exception}");
            return Task.CompletedTask;
        }
    }
}

As I’ve mentioned, it starts listening to only one of the two partitions. I added a second EventProcessorClient and, interestingly, sometimes it worked well 😃 I have no idea how they negotiate so that they don’t listen to the same partition.
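For context, the negotiation asked about here can be pictured with a small model. Processors that share a consumer group and a storage container coordinate through ownership records kept in that container, and with the default balanced behavior a processor claims partitions gradually, one per load-balancing cycle, which is why a freshly started client can appear to listen to a single partition at first. The sketch below is a simplified, dependency-free illustration of that idea and is not the SDK's implementation: `OwnershipSimulation`, `RunCycle`, and the processor name are hypothetical, the in-memory dictionary stands in for the blob container, and stealing partitions from overloaded processors is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified model of ownership-based load balancing.
// It only illustrates the idea of claiming one partition per cycle.
public static class OwnershipSimulation
{
    // Runs one load-balancing cycle for a single processor.
    // Returns true if the processor claimed a partition in this cycle.
    public static bool RunCycle(
        string processorId,
        IReadOnlyList<string> partitionIds,
        IDictionary<string, string> ownership, // partition id -> owning processor
        int activeProcessorCount)
    {
        // Fair share: partitions divided evenly among active processors, rounded up.
        int fairShare = (partitionIds.Count + activeProcessorCount - 1) / activeProcessorCount;
        int owned = ownership.Values.Count(owner => owner == processorId);

        if (owned >= fairShare)
        {
            return false; // Already holding a fair share; nothing to claim.
        }

        // Claim at most one unowned partition per cycle.
        foreach (var partitionId in partitionIds)
        {
            if (!ownership.ContainsKey(partitionId))
            {
                ownership[partitionId] = processorId;
                return true;
            }
        }

        return false; // No unowned partition left (stealing is not modeled).
    }

    public static void Main()
    {
        var partitions = new[] { "0", "1" };
        var ownership = new Dictionary<string, string>();

        // A single processor on a two-partition hub needs two cycles to own both.
        for (int cycle = 1; cycle <= 3; cycle++)
        {
            bool claimed = RunCycle("processor-A", partitions, ownership, 1);
            Console.WriteLine($"Cycle {cycle}: claimed={claimed}, owned={ownership.Count}");
        }
        // Prints:
        // Cycle 1: claimed=True, owned=1
        // Cycle 2: claimed=True, owned=2
        // Cycle 3: claimed=False, owned=2
    }
}
```

In this model a lone processor ends up owning both partitions after two cycles, while two processors would each settle on one. If the real client behaves along these lines, waiting a few load-balancing cycles after startup should show a second "Initializing partition" message from the `PartitionInitializingAsync` handler.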

Environment:

  • Packages: Azure.Messaging.EventHubs 5.2.0, Azure.Messaging.EventHubs.Processor 5.2.0
  • OS: Windows 10 Pro Version 2004
  • Framework: .NET Framework 4.7.2
  • IDE: Visual Studio Professional 2019 Version 16.7.2

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

jsquire commented, Oct 12, 2020 (1 reaction)

I’d be very, very careful when considering timestamps as any type of decision criteria; even a single machine can have issues with clock skew. In this case you’re describing at least two machines, which increases the risk. Adding more processors here would further increase that risk.

> but it would likely be a trade-off with additional cost for hosting

> Is that because of several threads concurrently processing the data?

In this case, I’m thinking in terms of additional machines to host. The processor runs most of its work in the background using the .NET thread pool and uses a dedicated connection for each partition that it owns. If you’re network-bound, using more processors wouldn’t provide any benefit. If you’re CPU-bound, using more processors on the same machine would be likely to increase contention and cause more context switches. Without profiling, it’s not possible to say for sure, but I suspect that you would see lower throughput in that case.

evgkar commented, Oct 26, 2020 (0 reactions)

@jsquire thank you for the very useful information that you’ve provided!
