[FEATURE REQ] Better support for batch processing in Event Hubs SDK

Library or service name. [Azure.Messaging.EventHubs]

Is your feature request related to a problem? Please describe. (This issue is prompted by the following twitter thread)

Batch-processing when receiving events in the new EventHubs SDK is much, much more difficult than in the previous SDK. When switching over I had to re-implement a lot of the functionality you used to get out of the box.

My batching workflow.

Just to sum up the sort of workflow I’m working with when it comes to batching:

  • I would like to process messages in batches of n.
  • If n messages haven’t been received within 1 sec, I’d like to batch process the events that are already lying around. (We have some data sources that are intermittent, and we can’t just leave messages around because new ones aren’t coming in.)
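
The workflow above can be sketched without any SDK types. The class below is a hypothetical helper (SizeOrTimeBatcher is not part of Azure.Messaging.EventHubs). It assumes the one-second window is measured from the arrival of the oldest buffered event, that the caller supplies the current time (for example from a heartbeat or a timer), and that each instance is used by a single caller at a time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not an SDK type.
public sealed class SizeOrTimeBatcher<T>
{
    private readonly int _maxCount;
    private readonly TimeSpan _maxWait;
    private readonly List<T> _pending = new List<T>();
    private DateTimeOffset _oldestArrival;

    public SizeOrTimeBatcher(int maxCount, TimeSpan maxWait)
    {
        if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));
        _maxCount = maxCount;
        _maxWait = maxWait;
    }

    // Buffers an item. Returns the full batch once maxCount items are
    // buffered, otherwise an empty list.
    public IReadOnlyList<T> Add(T item, DateTimeOffset now)
    {
        if (_pending.Count == 0) _oldestArrival = now;
        _pending.Add(item);
        return _pending.Count >= _maxCount ? Drain() : Array.Empty<T>();
    }

    // Call periodically. Returns a partial batch if the oldest buffered
    // item has waited at least maxWait, otherwise an empty list.
    public IReadOnlyList<T> Tick(DateTimeOffset now)
    {
        bool expired = _pending.Count > 0 && now - _oldestArrival >= _maxWait;
        return expired ? Drain() : Array.Empty<T>();
    }

    private IReadOnlyList<T> Drain()
    {
        T[] batch = _pending.ToArray();
        _pending.Clear();
        return batch;
    }
}
```

Passing the time in explicitly keeps the sketch free of timers, so the caller decides what drives Tick.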

Following the example here for batch processing led me to the following issues.

Problem: Managing Partition State

We have to manually manage state per partition. You used to be able to have the SDK do that for you in batching cases. In the example listed above it seems pretty easy: you just add a ConcurrentDictionary keyed by partition id with whatever you want to store, right? However, let’s take an example where it’s not so simple.

  • Processor 1 acquires partition 1 and batches up 10 messages. This is below the processing limit.
  • Processor 1 loses partition 1. The messages are still batched.
  • At some later point in time, Processor 1 reacquires partition 1. The old messages that are left will then be reprocessed.

To avoid this sort of situation, you’ll need to listen to all of the PartitionClosing events and ensure that you keep your state synchronized with them.
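
A minimal sketch of that synchronization, assuming the per-partition state is just a list of buffered items. PartitionBuffers and its members are hypothetical names, not SDK types; the point is that a Discard call on partition close leaves nothing stale to be found when the partition is reacquired.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical per-partition buffer for illustration; not an SDK type.
public sealed class PartitionBuffers<T>
{
    private readonly ConcurrentDictionary<string, List<T>> _buffers =
        new ConcurrentDictionary<string, List<T>>();

    // Buffers an item for the partition and returns the new buffer size.
    public int Add(string partitionId, T item)
    {
        var buffer = _buffers.GetOrAdd(partitionId, _ => new List<T>());
        lock (buffer)
        {
            buffer.Add(item);
            return buffer.Count;
        }
    }

    // Removes and returns everything buffered for the partition.
    public IReadOnlyList<T> Drain(string partitionId)
    {
        if (!_buffers.TryGetValue(partitionId, out var buffer))
        {
            return Array.Empty<T>();
        }
        lock (buffer)
        {
            T[] batch = buffer.ToArray();
            buffer.Clear();
            return batch;
        }
    }

    // Drops the partition's state when ownership is lost and returns how
    // many items were dropped. Nothing is lost for good as long as those
    // items were never checkpointed: they are delivered again later.
    public int Discard(string partitionId)
    {
        if (!_buffers.TryRemove(partitionId, out var buffer))
        {
            return 0;
        }
        lock (buffer)
        {
            return buffer.Count;
        }
    }
}
```

With EventProcessorClient, Discard would presumably be called from a PartitionClosingAsync handler using the partition id carried by the event args. Discarding rather than flushing is a design choice of this sketch.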

Problem: Heartbeat messages do not allow checkpointing.

Next up - the case where I want to process my messages after n seconds without any activity. Luckily there’s the heartbeat message for this.

Unfortunately, the heartbeat message doesn’t allow us to do things like checkpointing or reading lastEnqueuedTime, so I have to build up a structure that lets me retrieve those from the last message I’ve enqueued.

private async Task ProcessHeartbeatEvent(IProcessEventArgs args, string partitionId, ICheckpointer checkpointer)
{
    var data = _partitionedMessageBatcher.Drain(partitionId);

    // If there are no buffered messages on the partition, we've received multiple
    // heartbeat messages in a row, and there's no need to pass an empty list of
    // messages any further.
    if (data.Count == 0)
    {
        return;
    }

    // The updateCheckpoint delegate we get from Azure Event Hubs is coupled to the event.
    // When batching we always provide the updateCheckpoint from the latest event, except
    // when we receive a heartbeat message, in which case we use the one from the last "real" event.
    var lastRealEvent = data.Last();
    Func<CancellationToken, Task> lastUpdateCheckpointAsync = lastRealEvent.UpdateCheckpointAsync;

    // Use the partition context from the last event as well, as the heartbeat message
    // doesn't have the correct properties such as LastEnqueuedTime.
    var lastPartitionContext = lastRealEvent.Partition;

    var eventData = data
        .Select(bufferedArgs =>
        {
            // Args with null data shouldn't make it into the batcher.
            Debug.Assert(bufferedArgs.Data != null, "bufferedArgs.Data != null");
            return bufferedArgs.Data!;
        })
        .ToList();

    Log.Debug("Flushing {messageCount} messages due to heartbeat message", data.Count);
    var receivedEventDataBatch =
        new ReceivedEventDataBatch(lastPartitionContext, eventData, lastUpdateCheckpointAsync);
    await _processEvent(receivedEventDataBatch, checkpointer);
}

Perhaps this is because my codebase is shaped by the old SDK where checkpointing was done on a batch basis rather than as a function provided by each event received.

Summing up

  • Batching is harder than it needs to be. I imagine this use-case is common and I would prefer if something was provided that helped you do batching, similar to the old-style SDK.

  • However I think in lieu of that, something that will help you manage partition state would be nice. I’m not quite sure how that would work however.

  • I would also like to be able to call UpdateCheckpointAsync on a heartbeat message and have that checkpoint all the previous messages.

  • If nothing else, a more involved example in the documentation would be nice.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 5
  • Comments:13 (7 by maintainers)

Top GitHub Comments

jsquire commented, Mar 24, 2022 (4 reactions)

Hi folks,

Apologies for the lack of updates. Thanks in no small part to the discussion on this issue, we were able to prove the need for a better story around extending the processor and batch support. Starting with our next release, v5.7.0-beta.5, we’ve made the following improvements:

  • The Azure.Messaging.EventHubs package now defines a CheckpointStore type to normalize processor storage operations.

  • The Azure.Messaging.EventHubs package includes a PluggableCheckpointStoreEventProcessor<T> that can be extended with your processing logic without the need to implement storage operations.

  • The Blob Storage implementation used by the EventProcessorClient is now public in the Azure.Messaging.EventHubs.Processor package as BlobCheckpointStore, and can be used when extending processor types.

  • All event processor types now expose a protected UpdateCheckpoint member that types extending them can call. This new method does not require an EventData instance to create the checkpoint, only an offset.

More details can be found in this sample.

amshalev commented, Nov 1, 2021 (1 reaction)

Thanks for the quick reply! I see… the problem with using EventProcessor<T> is that it requires rewriting or copying a lot of code that already exists in EventProcessorClient. However, after looking at the code, it seems that using BlobsCheckpointStore would make it much easier. But I was disappointed to find that the BlobsCheckpointStore class is marked as internal. Could you consider changing it to public so that it’s possible to consume it directly?
