[Design Discussion] Should the Event Processor Client Support Creating Checkpoints for Arbitrary Events?

Summary

The current design of the EventProcessorClient invokes an event handler to process data read from the Event Hubs service. The arguments passed to that handler include a method for creating/updating a checkpoint based on the EventData instance associated with those arguments. This method takes no parameters and relies on the implicit context when called. It cannot be used to manipulate checkpoint data for events other than the one associated with the arguments.

At present, no other means of creating/updating a checkpoint is surfaced as part of the EventProcessorClient API. This makes things difficult for applications that would prefer to create a checkpoint “after every XX number of events or YY amount of time has passed”, which is not an uncommon use case.
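The “every XX events or YY time” rule can be sketched independently of the SDK. The `CheckpointPolicy` class below is a hypothetical helper, not part of Azure.Messaging.EventHubs, and its names and thresholds are assumptions chosen for illustration. It only decides when a checkpoint is due; writing the checkpoint is left to the caller.

```csharp
using System;

// Hypothetical helper (not an SDK type): decides when a checkpoint is due,
// based on a processed-event count or an elapsed interval.
// Not thread-safe; the sketch assumes one instance per partition.
public sealed class CheckpointPolicy
{
    private readonly int _maxEvents;
    private readonly TimeSpan _maxInterval;
    private int _eventsSinceCheckpoint;
    private DateTimeOffset _lastCheckpoint;

    public CheckpointPolicy(int maxEvents, TimeSpan maxInterval, DateTimeOffset now)
    {
        _maxEvents = maxEvents;
        _maxInterval = maxInterval;
        _lastCheckpoint = now;
    }

    // Records one processed event. Returns true when a checkpoint should be
    // written, and resets the count and the clock in that case.
    public bool RecordEvent(DateTimeOffset now)
    {
        _eventsSinceCheckpoint++;

        bool due = _eventsSinceCheckpoint >= _maxEvents
            || (now - _lastCheckpoint) >= _maxInterval;

        if (due)
        {
            _eventsSinceCheckpoint = 0;
            _lastCheckpoint = now;
        }

        return due;
    }
}
```

In an event handler, a call such as `policy.RecordEvent(DateTimeOffset.UtcNow)` could gate the existing parameterless checkpoint method on the handler arguments. Note that the time threshold here is only evaluated when an event arrives, so a quiet partition would not be checkpointed by this sketch alone.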

Scope of Discussion

  • Should the EventProcessorClient support a method to create/update checkpoints based on an arbitrary event?

  • Would such a method be more usable if it accepted the required data as individual arguments, as an EventProcessorCheckpoint and offset value, or as an EventProcessorCheckpoint and EventData instance?

  • Is there another potential design for enabling the scenario that should be considered?

Out of Scope

  • General changes to the EventProcessorClient unrelated to the theme of creating/updating checkpoints.

Concept Illustration

public class EventProcessorClient
{
    // This form assumes the context for the Event Hub and consumer group are sourced from the
    // EventProcessorClient and not provided individually.

    public async Task<EventHubCheckpoint> UpdateCheckpointAsync(
          string partitionId,
          long offset,
          CancellationToken cancellationToken = default);
}

var storageClient = new BlobContainerClient(<< ARGS >>);
var processor = new EventProcessorClient(storageClient, "<< CONSUMER GROUP >>", "<< CONNECTION STRING >>");

// Create a checkpoint for partition "0" using offset 12345
await processor.UpdateCheckpointAsync("0", 12345);

Considerations

  • The current model requires that the arguments passed to the event handler be cached for the event/partition combination for which a checkpoint is desired. This cache is often updated on each invocation of the event handler. When the threshold for checkpointing is reached, the cached arguments are referenced and the method is called.

  • The proposed concept would not remove the burden of having to track and cache information; it would, however, reduce the set of information being tracked to just the partition identifier and the offset of the desired event.
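As a rough sketch of the reduced bookkeeping described above, the hypothetical `PartitionOffsetTracker` below keeps only a partition identifier and the latest offset. The delegate passed to `FlushAsync` stands in for the proposed `UpdateCheckpointAsync(partitionId, offset)` method, which is a concept from this discussion and not an existing API; all other names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not an SDK type): remembers only the latest offset
// seen for each partition.
public sealed class PartitionOffsetTracker
{
    private readonly ConcurrentDictionary<string, long> _latest =
        new ConcurrentDictionary<string, long>();

    // Call from the event handler once an event has been processed.
    public void Record(string partitionId, long offset)
    {
        _latest[partitionId] = offset;
    }

    // Writes one checkpoint per tracked partition through the supplied
    // delegate, which is a stand-in for the proposed
    // UpdateCheckpointAsync(partitionId, offset) method.
    public async Task FlushAsync(
        Func<string, long, CancellationToken, Task> updateCheckpointAsync,
        CancellationToken cancellationToken = default)
    {
        foreach (var pair in _latest)
        {
            await updateCheckpointAsync(pair.Key, pair.Value, cancellationToken);
        }
    }
}
```

A caller would invoke `Record` on every handler invocation and `FlushAsync` whenever its count or time threshold is reached. This sketch flushes every tracked partition, including ones whose offset has not advanced since the last flush; skipping those would need an extra "dirty" flag.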

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

1 reaction
jsquire commented, Apr 24, 2020

In your proposed solution above, how would I get the offsets? Would the burden be on the developer to track this? Why not store the array of partitions and the current offsets of each within the processor so that I can make a call based on the latest offset for each partition?

That’s an interesting thought. That would also give us a query point to answer the question “what partitions are owned by the processor?” I wonder if we would want to track the event or just the offset. Something like:

public class EventProcessorClient
{
    public EventProcessorClientPartition OwnedPartitions  { get; }
}

// As suggested
public class EventProcessorClientPartition
{
    public string PartitionId { get; }
    public long LastProcessedEventOffset { get; }
}

// Alternative thought
public class EventProcessorClientPartition
{
    public string PartitionId { get; }
    public EventData LastProcessedEvent { get; }
}

0 reactions
minascasiou commented, Jul 25, 2022

I’m sorely missing a timer-based setting right now. Something like:

var clientOptions = new EventProcessorClientOptions { CheckpointFrequency = TimeSpan.FromMinutes(5) };

Including this simple property would save the complexity and risk of building an EventProcessor<TPartition>-based solution.

Many business use cases have quiet periods, handlers falling asleep, and overnight/intra-day processing cut-offs. IMO, in such cases the additional peace of mind, the reduced impact on RPO, and the reduced load on idempotent processing would be beneficial to the community.
