The Edge Hub reports unacknowledged messages without any error raised in the module
I was trying to measure the maximum throughput I could expect using messages with routing.
I have two simple edge modules:
- Producer - Listens for messages using ModuleClient.SetInputMessageHandlerAsync(...) and responds with a message containing the current time in UTC in a JSON object.
- Consumer - Sends a message to the producer module with the CorrelationId set to the current time in UTC, listens for a response message using ModuleClient.SetInputMessageHandlerAsync(...), and measures the latency between the two modules.
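The timing idea behind the consumer can be sketched without any IoT Edge dependency. This is a minimal illustration, not the code from the repo (which uses the C# ModuleClient): the function names make_correlation_id and latency_ms are hypothetical, and the sketch assumes the send time is encoded as an ISO 8601 UTC string in the correlation id.

```python
from datetime import datetime, timezone


def make_correlation_id(now=None):
    """Encode the send time (UTC) as an ISO 8601 string usable as a correlation id."""
    now = now or datetime.now(timezone.utc)
    return now.isoformat()


def latency_ms(correlation_id, received_at=None):
    """Return the milliseconds elapsed between the encoded send time and receipt."""
    sent_at = datetime.fromisoformat(correlation_id)
    received_at = received_at or datetime.now(timezone.utc)
    return (received_at - sent_at).total_seconds() * 1000.0


if __name__ == "__main__":
    cid = make_correlation_id()
    # In the real modules the id would travel to the producer and back through edgeHub routes.
    print(f"elapsed: {latency_ms(cid):.3f} ms")
```

Because both timestamps are taken on the same host in this setup, clock skew between modules should not distort the measurement.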
All code is available in my GitHub repo here: mill5james/IoTEdgeMethodVsMessage
And the images are published in Docker Hub at mill5james
When sending only messages between modules, I will repeatedly see unacknowledged messages in the Edge Hub without any error in the producer or consumer. These will be followed by exceptions from the MQTT stack. I have no clue as to the periodicity of these exceptions.
2019-06-16 21:06:13.114 +00:00 [WRN] - Error sending messages to module jamesp-iotedge2/producer
System.TimeoutException: Message completion response not received
at Microsoft.Azure.Devices.Edge.Hub.Core.Device.DeviceMessageHandler.SendMessageAsync(IMessage message, String input) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/device/DeviceMessageHandler.cs:line 363
at Microsoft.Azure.Devices.Edge.Hub.Core.Routing.ModuleEndpoint.ModuleMessageProcessor.ProcessAsync(ICollection`1 routingMessages, IDeviceProxy dp, CancellationToken token) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/routing/ModuleEndpoint.cs:line 164
2019-06-16 21:06:14.139 +00:00 [WRN] - Closing connection for device: jamesp-iotedge2/consumer, scope: ExceptionCaught, DotNetty.Codecs.DecoderException: [MQTT-2.3.1-1]
at DotNetty.Codecs.Mqtt.MqttDecoder.DecodePacketIdVariableHeader(IByteBuffer buffer, PacketWithId packet, Int32& remainingLength)
at DotNetty.Codecs.Mqtt.MqttDecoder.DecodePublishPacket(IByteBuffer buffer, PublishPacket packet, Int32& remainingLength)
at DotNetty.Codecs.Mqtt.MqttDecoder.DecodePacketInternal(IByteBuffer buffer, Int32 packetSignature, Int32& remainingLength, IChannelHandlerContext context)
at DotNetty.Codecs.Mqtt.MqttDecoder.TryDecodePacket(IByteBuffer buffer, IChannelHandlerContext context, Packet& packet)
at DotNetty.Codecs.Mqtt.MqttDecoder.Decode(IChannelHandlerContext context, IByteBuffer input, List`1 output)
at DotNetty.Codecs.ReplayingDecoder`1.CallDecode(IChannelHandlerContext context, IByteBuffer input, List`1 output)
at DotNetty.Codecs.ByteToMessageDecoder.ChannelRead(IChannelHandlerContext context, Object message)
at DotNetty.Transport.Channels.AbstractChannelHandlerContext.InvokeChannelRead(Object msg), 6cfe9aea
2019-06-16 21:06:14.140 +00:00 [INF] - Disposing MessagingServiceClient for device Id jamesp-iotedge2/consumer because of exception - DotNetty.Codecs.DecoderException: [MQTT-2.3.1-1]
at DotNetty.Codecs.Mqtt.MqttDecoder.DecodePacketIdVariableHeader(IByteBuffer buffer, PacketWithId packet, Int32& remainingLength)
at DotNetty.Codecs.Mqtt.MqttDecoder.DecodePublishPacket(IByteBuffer buffer, PublishPacket packet, Int32& remainingLength)
at DotNetty.Codecs.Mqtt.MqttDecoder.DecodePacketInternal(IByteBuffer buffer, Int32 packetSignature, Int32& remainingLength, IChannelHandlerContext context)
at DotNetty.Codecs.Mqtt.MqttDecoder.TryDecodePacket(IByteBuffer buffer, IChannelHandlerContext context, Packet& packet)
at DotNetty.Codecs.Mqtt.MqttDecoder.Decode(IChannelHandlerContext context, IByteBuffer input, List`1 output)
at DotNetty.Codecs.ReplayingDecoder`1.CallDecode(IChannelHandlerContext context, IByteBuffer input, List`1 output)
at DotNetty.Codecs.ByteToMessageDecoder.ChannelRead(IChannelHandlerContext context, Object message)
at DotNetty.Transport.Channels.AbstractChannelHandlerContext.InvokeChannelRead(Object msg)
Unfortunately, I am unsure if a message has been dropped.
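To get a rough idea of how often the warnings recur, the edgeHub log can be scanned for the timestamps of the matching lines. The following is a sketch only: it assumes the log line layout shown in the excerpt above (timestamp, UTC offset, level in brackets, then the message), and the helper names warning_times and gaps_seconds are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2019-06-16 21:06:13.114 +00:00 [WRN] - Error sending messages to module ...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2}) \[(\w+)\] - (.*)$"
)


def warning_times(lines, marker="Error sending messages to module"):
    """Collect timestamps of WRN lines whose message contains the marker text."""
    times = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2) == "WRN" and marker in match.group(3):
            times.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f %z")
            )
    return times


def gaps_seconds(times):
    """Return the gap in seconds between each pair of consecutive timestamps."""
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
```

The functions accept any iterable of lines, for example an open file holding the saved output of iotedge logs edgeHub. Stack trace lines have no leading timestamp and are skipped.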
Expected Behavior
Sending messages between modules should succeed when the edge hub is performing normally.
Current Behavior
The Edge Hub writes warnings to its logs. It is unclear if any messages were lost between modules.
Steps to Reproduce
- Download the deployment.json from my IoTEdgeMethodVsMessage GitHub repo
- Modify the deployment.json for the consumer to only enable messages by setting EnableMethod to false and EnableMessage to true
"consumer": {
"settings": {
"image": "mill5james/consumer:latest",
"createOptions": "{}"
},
"type": "docker",
"env": {
"EnableMethod": {
"value": "false"
},
"EnableMessage": {
"value": "true"
}
},
"version": "1.0",
"status": "running",
"restartPolicy": "always"
}
- Use the modified deployment.json to deploy to an IoT Edge device
- Observe the logs for the edgeHub module on the edge device to see the exceptions being thrown in the module
Context (Environment)
Output of iotedge check
iotedge check
Configuration checks
--------------------
√ config.yaml is well-formed
√ config.yaml has well-formed connection string
√ container engine is installed and functional
√ config.yaml has correct hostname
√ config.yaml has correct URIs for daemon mgmt endpoint
√ latest security daemon
√ host time is close to real time
√ container time is close to host time
‼ DNS server
Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
You can ignore this warning if you are setting DNS server per module in the Edge deployment.
‼ production readiness: certificates
Device is using self-signed, automatically generated certs.
Please see https://aka.ms/iotedge-prod-checklist-certs for best practices.
√ production readiness: certificates expiry
√ production readiness: container engine
‼ production readiness: logs policy
Container engine is not configured to rotate module logs which may cause it run out of disk space.
Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
You can ignore this warning if you are setting log policy per module in the Edge deployment.
Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with IoT Hub AMQP port
√ host can connect to and perform TLS handshake with IoT Hub HTTPS port
√ host can connect to and perform TLS handshake with IoT Hub MQTT port
√ container on the default network can connect to IoT Hub AMQP port
√ container on the default network can connect to IoT Hub HTTPS port
√ container on the default network can connect to IoT Hub MQTT port
√ container on the IoT Edge module network can connect to IoT Hub AMQP port
√ container on the IoT Edge module network can connect to IoT Hub HTTPS port
√ container on the IoT Edge module network can connect to IoT Hub MQTT port
√ Edge Hub can bind to ports on host
Device (Host) Operating System
Ubuntu 18.04 LTS
Architecture
amd64
Container Operating System
Linux containers
Runtime Versions
iotedged
iotedge 1.0.7.1 (f7c51d92be8336bc6be042e1f1f2505ba01679f3)
Edge Agent
mcr.microsoft.com/azureiotedge-agent:1.0 Version - 1.0.7.1.22377503 (f7c51d92be8336bc6be042e1f1f2505ba01679f3)
Edge Hub
mcr.microsoft.com/azureiotedge-hub:1.0 Version - 1.0.7.1.22377503 (f7c51d92be8336bc6be042e1f1f2505ba01679f3)
Docker
Docker version
Client:
Version: 3.0.5
API version: 1.40
Go version: go1.12.1
Git commit: ba9934d4
Built: Thu Apr 18 22:01:41 2019
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 3.0.5
API version: 1.40 (minimum version 1.12)
Go version: go1.12.1
Git commit: dbe4a30
Built: Thu Apr 18 22:07:58 2019
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.2.5
GitCommit: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc:
Version: 1.0.0-rc6+dev
GitCommit: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
docker-init:
Version: 0.18.0
GitCommit: fec3683
Top GitHub Comments
@mill5james we found the reason for this error. There was an ID counting error in the library: sending through 0x8000 messages within an hour led to this problem. It will be fixed in the following releases.
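For context, the [MQTT-2.3.1-1] rule cited in the decoder exception requires PUBLISH packets with QoS > 0 to carry a non-zero 16-bit packet identifier. The thread does not describe how the library's counter actually failed, so the following is only an illustrative sketch of a counter that stays within the valid range, not the fix that was shipped. The class name PacketIdAllocator is made up.

```python
class PacketIdAllocator:
    """Hands out MQTT packet identifiers in the range 1..0xFFFF, never 0."""

    MAX_ID = 0xFFFF

    def __init__(self):
        self._last = 0

    def next_id(self):
        # After 0xFFFF the modulo wraps the counter back to 1, skipping 0.
        self._last = self._last % self.MAX_ID + 1
        return self._last


if __name__ == "__main__":
    allocator = PacketIdAllocator()
    ids = [allocator.next_id() for _ in range(0xFFFF + 1)]
    # First id, the id at the 0x8000th message, the last id before wrap, and the wrapped id.
    print(ids[0], hex(ids[0x7FFF]), hex(ids[0xFFFE]), ids[0xFFFF])
```

A production allocator would also need to avoid reusing an identifier that is still awaiting acknowledgement, which this sketch does not attempt.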
@darobs If there is anything additional you need from me, just reach out. Glad to help.