question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[BUG] Delayed event hub partition claiming of EventProcessorClient

See original GitHub issue

Describe the bug We are using Azure Event Hub Service. We tested new version com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.16.0 and are observing delayed partition claiming, which in the end causes event processing delay.

Exception or Stack Trace

com.azure.storage.blob.models.BlobStorageException: Status code 412, \"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>ConditionNotMet</Code><Message>The condition specified using HTTP conditional header(s) is not met.
RequestId:......
Time:2022-10-21T15:45:32.6413117Z</Message></Error>\"
	at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(Unknown Source)
	at com.azure.core.implementation.http.rest.ResponseExceptionConstructorCache.invoke(ResponseExceptionConstructorCache.java:56)
	at com.azure.core.implementation.http.rest.RestProxyBase.instantiateUnexpectedException(RestProxyBase.java:367)
	at com.azure.core.implementation.http.rest.AsyncRestProxy.lambda$ensureExpectedStatus$1(AsyncRestProxy.java:115)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:113)
	... 77 common frames omitted
Wrapped by: java.lang.IllegalStateException: Error while claiming checkpoints
	at com.azure.messaging.eventhubs.PartitionBasedLoadBalancer.lambda$claimOwnership$24(PartitionBasedLoadBalancer.java:478)
	at reactor.core.publisher.LambdaMonoSubscriber.doError(LambdaMonoSubscriber.java:155)
	at reactor.core.publisher.LambdaMonoSubscriber.onError(LambdaMonoSubscriber.java:150)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onError(MonoFlatMap.java:172)
	at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onError(MonoCollectList.java:114)
	at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
	at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.checkTerminated(FluxFlatMap.java:842)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.drainLoop(FluxFlatMap.java:608)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.drain(FluxFlatMap.java:588)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.innerError(FluxFlatMap.java:863)
	at reactor.core.publisher.FluxFlatMap$FlatMapInner.onError(FluxFlatMap.java:990)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.checkTerminated(FluxFlatMap.java:842)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.drainLoop(FluxFlatMap.java:608)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.drain(FluxFlatMap.java:588)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.onError(FluxFlatMap.java:451)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.onNext(FluxFlatMap.java:416)
	at reactor.core.publisher.DrainUtils.postCompleteDrain(DrainUtils.java:136)
	at reactor.core.publisher.DrainUtils.postComplete(DrainUtils.java:187)
	at reactor.core.publisher.FluxMapSignal$FluxMapSignalSubscriber.onError(FluxMapSignal.java:192)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onError(FluxMapFuseable.java:142)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onError(MonoFlatMap.java:172)
	at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onError(FluxContextWrite.java:121)
	at reactor.core.publisher.FluxDoOnEach$DoOnEachSubscriber.onError(FluxDoOnEach.java:195)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:142)
	at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:74)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2398)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
	at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.set(Operators.java:2194)
	at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.onSubscribe(Operators.java:2068)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
	at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:157)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.FluxHide$SuppressFuseableSubscriber.onNext(FluxHide.java:137)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.FluxHide$SuppressFuseableSubscriber.onNext(FluxHide.java:137)
	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1816)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:151)
	at reactor.core.publisher.FluxDelaySubscription$DelaySubscriptionMainSubscriber.onNext(FluxDelaySubscription.java:189)
	at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99)
	at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99)
	at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onNext(FluxTimeout.java:180)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1816)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:151)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.complete(MonoIgnoreThen.java:292)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onNext(MonoIgnoreThen.java:187)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1816)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:151)
	at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99)
	at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onNext(FluxRetryWhen.java:174)
	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
	at reactor.core.publisher.Operators$MonoInnerProducerBase.complete(Operators.java:2664)
	at reactor.core.publisher.MonoSingle$SingleSubscriber.onComplete(MonoSingle.java:180)
	at reactor.core.publisher.MonoFlatMapMany$FlatMapManyInner.onComplete(MonoFlatMapMany.java:260)
	at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:144)
	at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onComplete(FluxDoFinally.java:128)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onComplete(FluxMapFuseable.java:152)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1817)
	at reactor.core.publisher.MonoCollect$CollectSubscriber.onComplete(MonoCollect.java:160)
	at reactor.core.publisher.FluxHandle$HandleSubscriber.onComplete(FluxHandle.java:220)
	at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onComplete(FluxMap.java:275)
	at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:400)
	at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:419)
	at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:473)
	at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:703)
	at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:93)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:336)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:308)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1373)
	at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1247)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1287)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:519)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:458)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:280)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:397)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)

To Reproduce Steps to reproduce the behavior:

  • Use EventProcessorClient for partition processing (32 partitions)
  • Define Greedy loadbalancing
  • Produce on all partitions an incoming event
  • Start the EventProcessorClient and observe how some partitions are starting to process events several minutes delayed ( up to 6-16 minutes )
  • After all partitions are claimed and started processing, the exceptions are stopping

Code Snippet

Expected behavior

  • EventProcessorClients should directly claim all partitions without delay (especially with GREEDY loadbalancing)
  • Old version works as expected: com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.12.2

Screenshots

Setup (please complete the following information):

  • OS: Docker eclipse-temurin:11
  • Library/Libraries:
[INFO] +- com.azure:azure-messaging-eventhubs:jar:5.14.0:compile
[INFO] |  +- (com.azure:azure-core:jar:1.33.0:compile - omitted for duplicate)
[INFO] |  \- com.azure:azure-core-amqp:jar:2.7.2:compile
[INFO] |     \- (com.azure:azure-core:jar:1.33.0:compile - omitted for duplicate)
[INFO] +- com.azure:azure-storage-blob:jar:12.20.0:compile
[INFO] |  +- (com.azure:azure-core:jar:1.33.0:compile - omitted for duplicate)
[INFO] |  +- com.azure:azure-core-http-netty:jar:1.12.6:compile
[INFO] |  |  \- (com.azure:azure-core:jar:1.33.0:compile - omitted for duplicate)
[INFO] |  +- com.azure:azure-storage-common:jar:12.19.0:compile
[INFO] |  |  +- (com.azure:azure-core:jar:1.33.0:compile - omitted for duplicate)
[INFO] |  |  \- (com.azure:azure-core-http-netty:jar:1.12.6:compile - omitted for duplicate)
[INFO] |  \- com.azure:azure-storage-internal-avro:jar:12.5.0:compile
[INFO] |     +- (com.azure:azure-core:jar:1.33.0:compile - omitted for duplicate)
[INFO] |     +- (com.azure:azure-core-http-netty:jar:1.12.6:compile - omitted for duplicate)
[INFO] |     \- (com.azure:azure-storage-common:jar:12.19.0:compile - omitted for duplicate)
[INFO] +- com.azure:azure-messaging-eventhubs-checkpointstore-blob:jar:1.16.0:compile
[INFO] |  +- (com.azure:azure-messaging-eventhubs:jar:5.14.0:compile - omitted for duplicate)
[INFO] |  \- (com.azure:azure-storage-blob:jar:12.20.0:compile - omitted for duplicate)
[INFO] +- com.azure:azure-identity:jar:1.6.1:compile
[INFO] |  +- (com.azure:azure-core:jar:1.33.0:compile - omitted for duplicate)
[INFO] |  \- (com.azure:azure-core-http-netty:jar:1.12.6:compile - omitted for duplicate)
[INFO] \- com.azure:azure-core:jar:1.33.0:compile
  • Java version: 11
  • App Server/Environment: Dropwizard, k8s
  • Frameworks:

Additional context Add any other context about the problem here.

Information Checklist

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added

Issue Analytics

  • State:closed
  • Created a year ago
  • Reactions:1
  • Comments:14 (7 by maintainers)

github_iconTop GitHub Comments

1reaction
thomasstoermercommented, Dec 12, 2022

I can confirm, that the issue is fixed for us after switching to the latest provided versions.

1reaction
liukun-msftcommented, Nov 18, 2022

We have reverted the incorrect changes as a quick fix. Please upgrade your eventhubs to the lastest version (current is azure-messaging-eventhubs:5.15.0 and azure-messaging-eventhubs-checkpointstore-blob:1.16.1) to solve this issue.

Read more comments on GitHub >

github_iconTop Results From Across the Web

EventProcessorClient with azure-messaging-eventhubs ...
Greetings, I can concur with zeljko that the message still shows up. {"az.sdk.message":"Failed to claim ownership.","exception":"Status code 412 ...
Read more >
Receive events using Event Processor Host - Azure Event Hubs
This article applies to the old version of Azure Event Hubs SDK. For current version of the SDK, see Balance partition load across...
Read more >
Azure Event Hubs Event Processor client library for .NET ...
Reading and processing events across all partitions of an Event Hub at scale with resilience to transient failures and intermittent network issues.
Read more >
azure-sdk-for-java eventhubs Partition has been lost
Yes, the EventProcessorClient in azure-messaging-eventhubs library will reconnect on such partitions. You don't need to change anything manually ...
Read more >
Azure Event Hubs - Apache Camel
Send and receive events to/from Azure Event Hubs using AMQP protocol. ... Sets the CheckpointStore the EventProcessorClient will use for storing partition ......
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found