[BUG] Error in Event Hub SDK version 5.1.1 ("There are no partitions in Event Hub")
Describe the bug
An error occurs when collecting data with the Event Hub SDK from the built-in endpoint of an Azure IoT Hub. The application works normally when run locally, but when it runs in an Azure Web App (Windows), the error appears after about 5 minutes. It occurs when the connection to the blob is lost and the partition information cannot be found.
Exception or Stack Trace
The following blob connection error occurs first.
WARN 20-06-29 22:00:45[reactor-http-nio-4] [HttpClientConnect:299] - [id: 0x91a7478a, L:/10.30.222.205:49449 - R:sabiotdevkrciothublog.blob.core.windows.net/52.239.190.132:443] The connection observed an error
java.io.IOException: Operation timed out
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:377)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1133)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
WARN 20-06-29 22:00:53[reactor-http-nio-9] [HttpClientConnect:299] - [id: 0x9511e7bc, L:/10.30.222.205:49524 - R:sabiotdevkrciothublog.blob.core.windows.net/52.239.190.132:443] The connection observed an error
java.io.IOException: Operation timed out
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:377)
at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1133)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
After the above error, the following error occurs because the partition IDs of the Event Hub cannot be found from the blob.
WARN 20-06-29 10:14:17[parallel-2] [PartitionBasedLoadBalancer:145] - Unable to get partitionIds from eventHubAsyncClient.
INFO 20-06-29 10:14:17[parallel-2] [PartitionBasedLoadBalancer:312] - Starting load balancer for 83cc1455-79ad-4ebd-8347-db5cbb85dd71
ERROR 20-06-29 10:14:17[parallel-2] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
WARN 20-06-29 10:14:17[parallel-2] [PartitionBasedLoadBalancer:318] - Load balancing for event processor failed - There are no partitions in Event Hub iothub-biot-kim-krc
There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:14:17[parallel-2] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:14:27[single-1] [ReactorReceiver:324] - connectionId[MF_778ef6_1593424938060] linkName[17_ea8dcf_1593424938076] entityPath[iothub-biot-kim-krc/ConsumerGroups/bespintest_1/Partitions/17] Error occurred in link.
The connection was inactive for more than the allowed 240000 milliseconds and is closed by container 'b200e2e624c245f8ae8b81abcce65f76_G1'., errorContext[NAMESPACE: iothub-ns-iothub-bio-3556112-b28e37bac0.servicebus.windows.net, PATH: iothub-biot-kim-krc/ConsumerGroups/bespintest_1/Partitions/17, REFERENCE_ID: 17_ea8dcf_1593424938076, LINK_CREDIT: 500]
ERROR 20-06-29 10:14:27[parallel-3] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:14:32[parallel-4] [ReactorConnection:324] - connectionId[MF_778ef6_1593424938060]: Connection is disposed. Cannot get CBS node.
ERROR 20-06-29 10:14:37[parallel-4] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:14:47[parallel-1] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:14:57[parallel-2] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:15:07[parallel-4] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:15:17[parallel-4] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:15:27[single-1] [ReactorReceiver:324] - connectionId[MF_604810_1593424958310] linkName[29_e66659_1593424958310] entityPath[iothub-biot-kim-krc/ConsumerGroups/bespintest_1/Partitions/29] Error occurred in link.
The connection was inactive for more than the allowed 240000 milliseconds and is closed by container 'a3e27f34ff6348599cdaf983c168e21c_G4'., errorContext[NAMESPACE: iothub-ns-iothub-bio-3556112-b28e37bac0.servicebus.windows.net, PATH: iothub-biot-kim-krc/ConsumerGroups/bespintest_1/Partitions/29, REFERENCE_ID: 29_e66659_1593424958310, LINK_CREDIT: 500]
ERROR 20-06-29 10:15:27[parallel-3] [PartitionBasedLoadBalancer:324] - There are no partitions in Event Hub iothub-biot-kim-krc
ERROR 20-06-29 10:15:32[parallel-3] [ReactorConnection:324] - connectionId[MF_604810_1593424958310]: Connection is disposed. Cannot get CBS node.
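The link errors above report an allowed inactivity window of 240000 milliseconds. As a small diagnostic sketch (the class name `IdleTimeoutParser` and its method are hypothetical helpers, not part of the Event Hub SDK), the value can be pulled out of such a log line and converted to minutes. It comes to 4 minutes, which is close to the "about 5 minutes" in the description. The logs alone do not establish that the two are related.

```java
import java.time.Duration;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the idle timeout out of a ReactorReceiver log line.
final class IdleTimeoutParser {
    // Matches the "allowed <N> milliseconds" fragment seen in the log above.
    private static final Pattern ALLOWED_MS = Pattern.compile("allowed (\\d+) milliseconds");

    // Returns the allowed idle time in milliseconds, or empty if the line has no such fragment.
    static OptionalLong allowedIdleMillis(String logLine) {
        Matcher m = ALLOWED_MS.matcher(logLine);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        String line = "The connection was inactive for more than the allowed 240000 milliseconds "
                + "and is closed by container 'b200e2e624c245f8ae8b81abcce65f76_G1'.";
        allowedIdleMillis(line).ifPresent(ms ->
                System.out.println(Duration.ofMillis(ms).toMinutes() + " minutes")); // prints "4 minutes"
    }
}
```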
To Reproduce
Code Snippet
The summarized code is below.
public class MessageService {
    private final String eventHubConnectionString = "===============";
    private final String eventHubName = "===============";
    private final String storageConnectionString = "===============";
    private final String storageBlobContainerName = "===============";

    // Note: iotHubConnectionString and log are used below but are not declared in this summarized snippet.
    public void receiveMessageFromIoTHub() throws IOException, ExecutionException, InterruptedException {
        // Open the IoT Hub service client over AMQPS and wait for the connection.
        ServiceClient iotHubServiceClient = ServiceClient.createFromConnectionString(iotHubConnectionString, IotHubServiceClientProtocol.AMQPS);
        CompletableFuture<Void> connect = iotHubServiceClient.openAsync();
        connect.get();
        System.out.println("********* Successfully created an IoT Hub ServiceClient.");

        // Callbacks for partition initialization, errors, and received events.
        Consumer<InitializationContext> eventhubProcessPartitionInitialization = initializationContext -> System.out.println("Initialized partition: " + initializationContext.getPartitionContext().getPartitionId());
        Consumer<ErrorContext> eventhubProcessError = errorContext -> System.out.println(errorContext.getThrowable().getMessage());
        Consumer<EventContext> eventhubProcessEvent = eventContext -> {
            try {
                String receivedMessage = eventContext.getEventData().getBodyAsString();
                log.debug(String.format("Message Received : %s \n", receivedMessage));
                LocalDateTime messageInTime = new Timestamp(System.currentTimeMillis()).toLocalDateTime();
                // Checkpoint after each event.
                eventContext.updateCheckpoint();
            } catch (Exception ex) {
                // Checkpoint even when processing fails, then print the stack trace.
                eventContext.updateCheckpoint();
                ex.printStackTrace();
            }
        };

        // Blob container used as the checkpoint store.
        BlobContainerAsyncClient blobContainerAsyncClient = new BlobContainerClientBuilder()
                .connectionString(storageConnectionString)
                .containerName(storageBlobContainerName)
                .buildAsyncClient();

        EventProcessorClient eventhubProcessorClient = new EventProcessorClientBuilder()
                .connectionString(eventHubConnectionString, eventHubName)
                .processEvent(eventhubProcessEvent)
                .processError(eventhubProcessError)
                .processPartitionInitialization(eventhubProcessPartitionInitialization)
                .consumerGroup("testgroup")
                .checkpointStore(new BlobCheckpointStore(blobContainerAsyncClient))
                .buildEventProcessorClient();

        System.out.println("Starting event processor");
        eventhubProcessorClient.start();
    }
}
Expected behavior
The problem should not occur with the SDK versions below.
Azure.Messaging.EventHubs version: 'com.azure:azure-messaging-eventhubs:5.1.1'
Azure.Messaging.EventHubs.Checkpointstore version: 'com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.1.1'
Azure.Storage.Blob version: 'com.azure:azure-storage-blob:12.6.0'
The problem does not occur with the versions below. Please check what differs between the two sets of versions and why the problem occurs only with the latest one.
Azure.Messaging.EventHubs version: 'com.azure:azure-messaging-eventhubs:5.0.3'
Azure.Messaging.EventHubs.Checkpointstore version: 'com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.0.3'
Azure.Storage.Blob version: 'com.azure:azure-storage-blob:12.6.0'
Setup (please complete the following information):
- OS: Windows (Azure Web App)
- IDE: Spring Tool Suite
- Version of the library used:
  - Java version: Java 8
  - Azure.Messaging.EventHubs version: 'com.azure:azure-messaging-eventhubs:5.1.1'
  - Azure.Messaging.EventHubs.Checkpointstore version: 'com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.1.1'
  - Azure.Storage.Blob version: 'com.azure:azure-storage-blob:12.6.0'
Additional context
In addition, the following error occurred in the Event Hub SDK partway through a run.
INFO 20-07-01 08:07:33[pool-12-thread-1] [PartitionBasedLoadBalancer:335] - Starting load balancer for aea256d3-d9cb-4628-9591-b98abb79b558
ERROR 20-07-01 08:07:33[reactor-http-nio-1] [JacksonAdapter:347] - Unexpected first character (char code 0xEF), not valid in xml document: could be mangled UTF-8 BOM marker. Make sure that the Reader uses correct encoding or pass an InputStream instead
WARN 20-07-01 08:07:33[reactor-http-nio-1] [PartitionBasedLoadBalancer:341] - Load balancing for event processor failed - HTTP response has a malformed body.
HTTP response has a malformed body.
INFO 20-07-01 08:07:43[pool-12-thread-1] [PartitionBasedLoadBalancer:335] - Starting load balancer for aea256d3-d9cb-4628-9591-b98abb79b558
ERROR 20-07-01 08:07:43[reactor-http-nio-1] [JacksonAdapter:347] - Unexpected first character (char code 0xEF), not valid in xml document: could be mangled UTF-8 BOM marker. Make sure that the Reader uses correct encoding or pass an InputStream instead
WARN 20-07-01 08:07:43[reactor-http-nio-1] [PartitionBasedLoadBalancer:341] - Load balancing for event processor failed - HTTP response has a malformed body.
HTTP response has a malformed body.
INFO 20-07-01 08:07:53[pool-12-thread-1] [PartitionBasedLoadBalancer:335] - Starting load balancer for aea256d3-d9cb-4628-9591-b98abb79b558
ERROR 20-07-01 08:07:53[reactor-http-nio-1] [JacksonAdapter:347] - Unexpected first character (char code 0xEF), not valid in xml document: could be mangled UTF-8 BOM marker. Make sure that the Reader uses correct encoding or pass an InputStream instead
WARN 20-07-01 08:07:53[reactor-http-nio-1] [PartitionBasedLoadBalancer:341] - Load balancing for event processor failed - HTTP response has a malformed body.
HTTP response has a malformed body.
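The JacksonAdapter message above points at a first character with code 0xEF, which is the first byte of the UTF-8 byte order mark (EF BB BF). The following is a minimal diagnostic sketch, assuming the raw response bytes are available to inspect. `BomCheck` and its methods are hypothetical names used for illustration only; they are not part of the Azure SDK, and this is not presented as the SDK's fix.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for checking whether a response body starts with a UTF-8 BOM.
final class BomCheck {
    // The UTF-8 byte order mark: EF BB BF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True if the body begins with the three BOM bytes.
    static boolean startsWithUtf8Bom(byte[] body) {
        if (body == null || body.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (body[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the body without a leading BOM, or the same array if there is none.
    static byte[] stripUtf8Bom(byte[] body) {
        return startsWithUtf8Bom(body)
                ? Arrays.copyOfRange(body, UTF8_BOM.length, body.length)
                : body;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        System.out.println(startsWithUtf8Bom(withBom)); // prints "true"
        System.out.println(new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8)); // prints "<a/>"
    }
}
```

A check like this only helps confirm what the parser complained about. It does not explain why the storage response carried a BOM in the first place.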
Information Checklist
Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.
- [v] Bug Description Added
- [v] Repro Steps Added
- [v] Setup information Added
I’m facing the same issue on an Azure App as well.
I updated the version from 5.1.0 to 5.1.1 because of the bug discussed in https://github.com/Azure/azure-sdk-for-java/issues/8439. When I was on version 5.1.0, I was also able to reproduce this bug locally. With 5.1.1 I can no longer reproduce it locally, but the problem still occurs in the Azure cloud, and sometimes one or more partitions end up stuck.
This issue is still reproducible with:
- Java version: Java 8
- Azure.Messaging.EventHubs version: 'com.azure:azure-messaging-eventhubs:5.1.1'
- Azure.Messaging.EventHubs.Checkpointstore version: 'com.azure:azure-messaging-eventhubs-checkpointstore-blob:1.1.1'
- Azure.Storage.Blob version: 'com.azure:azure-storage-blob:12.7.0'