Python Interpreter shutdown when using multithreading

  • OS and version used: Ubuntu 16.04.3

  • Python runtime used: Python 3.5.2

  • SDK version used: azure-iothub-device-client==1.3.5

Description of the issue:

When sending data to an IoT Hub with the azure-iothub-device-client library from multiple threads, the Python interpreter crashes after an unpredictable amount of time. The crash is 100% reproducible, but the time until it happens varies. Using more threads and sending more frequently seems to make it happen sooner: with 50 threads, each sending once per second, it crashed after about 30 minutes.

The error one gets is simply: Fatal Python error: GC object already tracked

Sometimes this comes with a SIGABRT but not always.

Since I ran Python under gdb, I have attached a backtrace with all the info (gdb.txt). I think the most important parts are at the top. I also included a py-bt of all threads, most of which are just sleeping.
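For readers who cannot run the process under gdb, the standard-library faulthandler module is a lighter way to get similar Python-level information. The sketch below is not part of the original report; the helper name enable_crash_tracebacks is made up for illustration. It asks CPython to dump the Python stack of every thread when the process receives a fatal signal such as SIGABRT or SIGSEGV.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=sys.stderr):
    """Dump the Python stack of all threads on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT,
    SIGBUS and SIGILL. `stream` must be a real file (it needs a
    file descriptor) and must stay open while the handler is enabled.
    """
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Call this once, before starting any worker threads.
    enable_crash_tracebacks()
```

The same effect is available without code changes by starting the interpreter with python -X faulthandler. It only shows Python frames, so a native backtrace from gdb is still needed to see where the extension module itself fails.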

Code sample exhibiting the issue:

import threading
import json
import time
import random

from iothub_client import IoTHubClient, IoTHubTransportProvider, IoTHubMessage, IoTHubError


class AzureThreadingTest(threading.Thread):
    def __init__(self, azure_config_file, name):
        super().__init__(name=name)
        self.MSG_COUNT = 0
        self.SEND_CALLBACKS = 0

        self.sleep_time = 1
        self.message_print_time = 30
        self.name = name

        with open(azure_config_file) as json_data:
            self.azure_config = json.load(json_data)

        self.iothub_client = self._iothub_client_init()

    def _iothub_client_init(self):
        connection_string = self.azure_config["iot_hub"]["ConnectionString"]
        protocol = self.azure_config["iot_hub"]["Options"]["PROTOCOL"]
        if protocol == "HTTP":
            protocol_class = IoTHubTransportProvider.HTTP
        elif protocol == "AMQP":
            protocol_class = IoTHubTransportProvider.AMQP
        elif protocol == "MQTT":
            protocol_class = IoTHubTransportProvider.MQTT
        else:
            protocol_class = IoTHubTransportProvider.HTTP
        client = IoTHubClient(connection_string, protocol_class)
        client.set_option("messageTimeout", self.azure_config["iot_hub"]["Options"]["MESSAGE_TIMEOUT"])

        if client.protocol == IoTHubTransportProvider.HTTP:
            timeout = 241000
            minimum_polling_time = 9
            client.set_option("timeout", timeout)
            client.set_option("MinimumPollingTime", minimum_polling_time)

        if client.protocol == IoTHubTransportProvider.MQTT:
            client.set_option("logtrace", 0)

        return client

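    def _send_confirmation_callback(self, message, result, user_context):
        # Invoked by the SDK once the send completes; user_context is the
        # thread instance that was passed to send_event_async.
        user_context.SEND_CALLBACKS += 1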

    def _send_to_azure(self, msg):
        try:
            msg = IoTHubMessage(bytearray(str(msg), 'utf8'))
            self.MSG_COUNT += 1
            self.iothub_client.send_event_async(msg, self._send_confirmation_callback, self)
        except IoTHubError as iothub_error:
            print("Unexpected error {} from IoTHub".format(iothub_error))

    def run(self):
        while True:
            try:
                data = {'some_data': random.random()}
                vs = json.dumps(data)
                if self.MSG_COUNT % self.message_print_time == 0:
                    print("Thread {} sent {}".format(self.name, vs))
                self._send_to_azure(vs)
                time.sleep(self.sleep_time)
            except Exception as e:
                print(e)


if __name__ == '__main__':
    thread_count = 50
    for i in range(thread_count):
        test = AzureThreadingTest('azure_config.json', str(i))
        test.daemon = True
        print('Starting thread {}'.format(i))
        test.start()
        time.sleep(1)

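    # Keep the main thread alive: the workers are daemon threads and die
    # when it exits. Note that this is a busy-wait competing for the GIL.
    while True:
        pass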

The code is pretty simple: it just sets up the connection and sends random data.

The protocol we are using is HTTP. The rest of the config file is just the connection strings and shouldn’t matter for the issue.

I think the issue lies with the callback handling in the send_event_async method, as I am not sure the GIL is handled correctly there, but I am more or less guessing blindly.
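One way to probe that guess is to make sure only a single Python thread ever calls into the client, and see whether the crash still occurs. The following is a minimal sketch under that assumption, not a confirmed fix: SerializedSender and send_fn are hypothetical names, with send_fn standing in for a wrapper around the send_event_async call. Producers put payloads on a queue, and one dedicated thread drains it.

```python
import queue
import threading


class SerializedSender(threading.Thread):
    """The only thread that ever calls send_fn (hypothetical wrapper
    around the real client's send call)."""

    _STOP = object()  # sentinel telling the sender loop to exit

    def __init__(self, send_fn):
        super().__init__(name="serialized-sender", daemon=True)
        self._send_fn = send_fn
        self._outbox = queue.Queue()

    def submit(self, payload):
        """Thread-safe: may be called from any producer thread."""
        self._outbox.put(payload)

    def stop(self):
        """Ask the sender to exit after draining earlier payloads."""
        self._outbox.put(self._STOP)

    def run(self):
        while True:
            payload = self._outbox.get()
            if payload is self._STOP:
                break
            try:
                self._send_fn(payload)
            except Exception as exc:
                print("send failed: {}".format(exc))


if __name__ == "__main__":
    sender = SerializedSender(print)  # print stands in for the real send
    sender.start()
    for i in range(3):
        sender.submit("message {}".format(i))
    sender.stop()
    sender.join()
```

This only removes concurrent callers on the Python side. The confirmation callback would presumably still be invoked from the SDK's own worker thread, so if the fault is in how the native layer acquires the GIL around that callback, this pattern may not avoid the crash.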

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 17 (5 by maintainers)

Top GitHub Comments

2 reactions
pierreca commented, Aug 29, 2019

@LouanDuToitS3 @Cnidarias @uitdam @bulletlink @jackt-moran sorry for the lack of communication or solutions until now. A big part of the reason we haven’t addressed this is that we’ve been busy rewriting the whole SDK in pure Python. We realized last fall, after taking a good look at the issues (especially those related to native code behaviors, import errors, and platform support), that the way v1 of the SDK was built was problematic and definitely not future-proof. Getting to the point where we could release a new SDK that offers a better experience for Python users took longer than expected, and we definitely should have communicated this more openly.

While it may be “too little too late” at this point, I’m also hoping we’re still “better late than never”, at least for folks who’ve experienced this issue recently. We’ve reached a point where we feel confident pushing v2 of the SDK into master and previewing it with customers. The APIs are a lot more Pythonic, and if bugs are found they will be fixed much faster, because the environment is much friendlier for Pythonistas and we don’t depend on another team to build the product.

I hope that what you’ll find in master today (and in the azure-iot-device package on PyPI) will suit you a lot better than what we had before. Please do let us know.
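For orientation, sending one message with the pure-Python package might look roughly like the sketch below. It is not taken from the thread: build_payload and send_once are hypothetical helpers, and the client calls reflect the azure-iot-device API as it is generally documented, so names may differ in the preview releases mentioned above.

```python
import json
import random


def build_payload():
    """Same random payload shape as the reproduction script."""
    return json.dumps({"some_data": random.random()})


def send_once(connection_string):
    """Connect, send a single message, and disconnect.

    Assumes the azure-iot-device package is installed; the import is
    done lazily so build_payload works without it.
    """
    from azure.iot.device import IoTHubDeviceClient, Message

    client = IoTHubDeviceClient.create_from_connection_string(connection_string)
    client.connect()
    try:
        client.send_message(Message(build_payload()))
    finally:
        client.disconnect()
```

Calling send_once with a device connection string would replace the IoTHubClient setup and the send_event_async call from the v1 sample. Whether one client can be shared safely across many threads is not addressed in the thread and would need to be checked against the package's documentation.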

0 reactions
az-iot-builder-01 commented, Dec 17, 2019

@bulletlink, @jackt-moran, @uitdam, @Cnidarias, @LouanDuToitS3, @BertKleewein, thank you for your contribution to our open-sourced project! Please help us improve by filling out this 2-minute customer satisfaction survey
