[Technical Question] Follow up to "How do I handle azure.iot.device send_message hanging when message quota is exhausted #775"
Hi @cartertinney, you provided a fix for issue #775, but I am still having problems with this code. Should the original issue be reopened, or should a new one be created? I will briefly explain what I have been trying to do here and perhaps you can advise.
I’ll give a bit more of an overview of my project. It is a Python 3 project running on a Raspberry Pi 4. I have a number of environmental sensors connected (temperature, humidity, light, etc.) and, based on the readings from these sensors, I control devices such as heaters, coolers and lighting via relays connected to output pins on the Raspberry Pi.
I have a pairing class that pairs a sensor with a device or two, and based on the sensor reading I control the devices paired with it. I have a number of instances of these pairings and I run each of them on a separate thread. It is within each pairing that I am trying to send telemetry data to Azure using the send_message method of IoTHubDeviceClient. The send_message call is in a singleton class that gets initialised at startup.
Issue #775 was raised because when I reached my message quota (8000, as I’m developing on the free tier), my code froze on the send_message line. The solution you provided seemed to address the issue if I created my client using the connection_retry=False parameter, i.e.
self.client = IoTHubDeviceClient.create_from_connection_string(self.connectionString,connection_retry=False)
What I found in my initial tests was that once the quota had been exceeded, the send_message call would raise an exception which I was able to trap. Since reporting that all looked fine, I have been running continuous tests, but I am finding that I get problems after around 24 hours of continuous running (I did post on #775, but I guess you do not monitor closed threads, so I will post my comments below). Sometimes I’m getting errors like the one below and other times my code hangs. The hanging is a particular worry to me: if the code hangs while one of my heaters is on, I run into a situation where my real-world environment overheats, causing safety concerns.
I will happily provide more logging after I’ve completed my current round of testing.
What I do know is that:
- If I disable my send_message code, then my rig runs continuously without issue. It has currently been running 24 hours a day for 4 days without issue.
- The issue does not only happen when I exceed the message quota. Sometimes it occurs after only 2000 messages (all messages are small).
Could you share with me a small sample of how you would execute send_message from multiple threads (around 5 different threads), including initialising the client, checking for exceptions, and reconnecting if the connection is dropped for some reason? I’d really appreciate this, as I am looking to do full field testing within the next 6 weeks and I need to be able to have confidence that I can use the library to collect my data without safety fears of this method killing my code.
I am sure the problem will be with the way I am using the IoTHubDeviceClient. My class that actually does the sending is…
Let me know if you need more info
Example error
I left the code running for 24 hours. It seems I spoke too soon, as I woke this morning to see logs full of these errors:
2021-06-29 04:00:58,927 MAIN_GR INFO: T_Avg (Thread-2) Reading 20.62466620889601
ReconnectStage: DisconnectEvent received while in unexpected state - LOGICALLY_DISCONNECTED, Connected: False
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/home/pi/.local/lib/python3.7/site-packages/azure/iot/device/iothub/sync_clients.py", line 34, in handle_result
    return callback.wait_for_completion()
  File "/home/pi/.local/lib/python3.7/site-packages/azure/iot/device/common/evented_callback.py", line 70, in wait_for_completion
    raise self.exception
azure.iot.device.common.pipeline.pipeline_exceptions.OperationCancelled: OperationCancelled('Operation cancelled before PUBACK received')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/CabinetFarming/WIP/Pairings.py", line 80, in Start
    self._doWork(settings)
  File "/home/pi/CabinetFarming/WIP/Pairings.py", line 125, in _doWork
    pairingSetpoint=setPoint
  File "/home/pi/CabinetFarming/WIP/AZTelemetryService.py", line 66, in SendMessage
    self.client.send_message(message)
  File "/home/pi/.local/lib/python3.7/site-packages/azure/iot/device/patch_documentation.py", line 66, in send_message
    return super(IoTHubDeviceClient, self).send_message(message)
  File "/home/pi/.local/lib/python3.7/site-packages/azure/iot/device/iothub/sync_clients.py", line 322, in send_message
    handle_result(callback)
  File "/home/pi/.local/lib/python3.7/site-packages/azure/iot/device/iothub/sync_clients.py", line 56, in handle_result
    raise exceptions.OperationCancelled(message="Could not complete operation", cause=e)
azure.iot.device.exceptions.OperationCancelled: OperationCancelled('Could not complete operation') caused by OperationCancelled('Operation cancelled before PUBACK received')
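The traceback shows the exception from send_message propagating out of the thread target, which ends Thread-6 and with it that pairing's control loop. Below is a minimal sketch of one way to keep a telemetry failure from stopping control, and to put the outputs into a safe state if the thread exits for any other reason. Every name in it (run_pairing, read_sensor, control, send_telemetry, make_safe, cycles) is hypothetical and stands in for the project's own code. It is not taken from the SDK or from Pairings.py. It only covers the case where the send raises. It does nothing for a call that never returns.

```python
import logging

log = logging.getLogger("pairing")


def run_pairing(read_sensor, control, send_telemetry, make_safe, cycles):
    """Run `cycles` control iterations; return how many telemetry sends failed.

    All four callables are hypothetical stand-ins:
      read_sensor()       -> current reading
      control(reading)    -> switch relays based on the reading
      send_telemetry(r)   -> would wrap client.send_message in a real setup
      make_safe()         -> switch heaters etc. off
    """
    telemetry_errors = 0
    try:
        for _ in range(cycles):
            reading = read_sensor()
            control(reading)
            try:
                send_telemetry(reading)
            except Exception:
                # A failed or cancelled send loses one data point only;
                # it must not end the control loop.
                telemetry_errors += 1
                log.exception("telemetry send failed; control continues")
    finally:
        # Runs on normal exit and when control/read_sensor raise.
        make_safe()
    return telemetry_errors
```

Catching Exception broadly is a deliberate choice here: the control loop is treated as more important than any single telemetry message, so the error is logged and counted and the loop continues.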
Top GitHub Comments
I will be working on this over the weekend. I will send an update.
On Thu, 22 Jul 2021, 20:33 Carter Tinney, @.***> wrote:
@terryholland
I’m going to go ahead and close the issue now since I haven’t heard back, but do please open a new GitHub issue when you have reproduced the issue.
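As a possible starting point for a reproduction, here is a minimal sketch of the multi-threaded sending pattern asked about in the question: the pairing threads hand messages to a bounded queue and a single background thread does the actual sending, so a slow or stuck send cannot block a control thread. This is an assumption about one workable design, not a pattern confirmed by the maintainers. TelemetrySender, submit, close and max_pending are hypothetical names, and `send` is any callable; in a real setup it would wrap the client's send_message.

```python
import queue
import threading


class TelemetrySender:
    """Funnels messages from many threads through one sender thread."""

    def __init__(self, send, max_pending=100):
        # `send` is a hypothetical callable taking one message.
        self._send = send
        self._pending = queue.Queue(maxsize=max_pending)
        self._lock = threading.Lock()
        self.dropped = 0  # messages rejected because the queue was full
        self.failed = 0   # messages whose send raised
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, message):
        """Called from control threads; returns immediately, never blocks."""
        try:
            self._pending.put_nowait(message)
            return True
        except queue.Full:
            with self._lock:
                self.dropped += 1
            return False

    def _run(self):
        while True:
            message = self._pending.get()
            if message is None:  # sentinel pushed by close()
                break
            try:
                self._send(message)
            except Exception:
                # Only this message is lost; the sender thread keeps going.
                self.failed += 1

    def close(self, timeout=5.0):
        """Ask the sender thread to finish queued messages and stop."""
        try:
            self._pending.put(None, timeout=timeout)
        except queue.Full:
            return  # sender is stuck; the daemon thread ends with the process
        self._thread.join(timeout)
```

Each pairing thread would call submit(message) and carry on regardless of the result. If the sender thread stalls, the queue fills and later messages are counted in dropped rather than holding up control. Recreating the client after repeated failures could be added inside the except branch, but that is left out here because the right reconnect behaviour depends on the SDK.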