ProvisioningDeviceClient leaking file descriptors on failure to provision


Context

  • OS and version used: Ubuntu Core
  • Python version: 3.8.5
  • azure-iot-device: v2.9.0

Description of the issue

I have been seeing file descriptors build up over time during connect/disconnect cycles with the Azure IoT Python SDK. I’ve narrowed at least one of those leaks down to ProvisioningDeviceClient. I create a provisioning client with ProvisioningDeviceClient.create_from_x509_certificate() and/or ProvisioningDeviceClient.create_from_symmetric_key(), then call: await client.register()

To reliably cause a provisioning failure, I block the device in the Azure IoT Central portal (from the manage device menu) and then have my system repeatedly retry provisioning. Since the device is blocked, each attempt fails with a client error: ClientError('Unexpected failure') caused by ServiceError('Query Status operation returned a failed registration status with a status code of 200'), which is expected. However, every time this happens, some file descriptors appear to be left open. This can be seen by running lsof, where I see a build-up that looks like:

python3 7082 root 990u IPv4 4183756 0t0 TCP localhost:39219 (LISTEN)
python3 7082 root 991u IPv4 3752021 0t0 TCP localhost:46158->localhost:41733 (ESTABLISHED)
python3 7082 root 992u IPv4 3752022 0t0 TCP localhost:41733->localhost:46158 (ESTABLISHED)
python3 7082 root 993u IPv4 2219704 0t0 TCP localhost:38190->localhost:39981 (ESTABLISHED)
python3 7082 root 994u IPv4 2219705 0t0 TCP localhost:39981->localhost:38190 (ESTABLISHED)
python3 7082 root 995u IPv4 2228111 0t0 TCP <USER>:38373->23.96.222.45:https (ESTABLISHED)
python3 7082 root 996u IPv4 2228107 0t0 TCP localhost:36778->localhost:34745 (ESTABLISHED)
python3 7082 root 997u IPv4 2228108 0t0 TCP localhost:34745->localhost:36778 (ESTABLISHED)
python3 7082 root 998u IPv4 2233216 0t0 TCP <USER>:40459->20.49.99.105:https (ESTABLISHED)
python3 7082 root 999u IPv4 2224629 0t0 TCP localhost:38436->localhost:45177 (ESTABLISHED)
python3 7082 root 1000u IPv4 2224630 0t0 TCP localhost:45177->localhost:38436 (ESTABLISHED)
python3 7082 root 1001u IPv4 2242130 0t0 TCP <USER>:37297->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1002u IPv4 2239931 0t0 TCP localhost:35740->localhost:44409 (ESTABLISHED)
python3 7082 root 1003u IPv4 2239932 0t0 TCP localhost:44409->localhost:35740 (ESTABLISHED)
python3 7082 root 1004u IPv4 2246264 0t0 TCP <USER>:55565->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1005u IPv4 2244990 0t0 TCP localhost:33896->localhost:41849 (ESTABLISHED)
python3 7082 root 1006u IPv4 2244991 0t0 TCP localhost:41849->localhost:33896 (ESTABLISHED)
python3 7082 root 1007u IPv4 2888225 0t0 TCP <USER>:34721->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1008u IPv4 2257048 0t0 TCP localhost:41728->localhost:37823 (ESTABLISHED)
python3 7082 root 1009u IPv4 2257049 0t0 TCP localhost:37823->localhost:41728 (ESTABLISHED)
python3 7082 root 1010u IPv4 2262215 0t0 TCP <USER>:37229->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1011u IPv4 2262213 0t0 TCP localhost:57022->localhost:35511 (ESTABLISHED)
python3 7082 root 1012u IPv4 2263728 0t0 TCP localhost:35511->localhost:57022 (ESTABLISHED)
python3 7082 root 1013u IPv4 2482462 0t0 TCP <USER>:52517->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1014u IPv4 2271152 0t0 TCP localhost:37584->localhost:37143 (ESTABLISHED)
python3 7082 root 1015u IPv4 2271153 0t0 TCP localhost:37143->localhost:37584 (ESTABLISHED)
python3 7082 root 1016u IPv4 2277756 0t0 TCP <USER>:51283->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1017u IPv4 2280846 0t0 TCP localhost:55030->localhost:45309 (ESTABLISHED)
python3 7082 root 1018u IPv4 2280847 0t0 TCP localhost:45309->localhost:55030 (ESTABLISHED)
python3 7082 root 1019u IPv4 2278378 0t0 TCP <USER>7:43427->20.49.99.105:https (ESTABLISHED)
python3 7082 root 1020u IPv4 2285742 0t0 TCP localhost:42732->localhost:36379 (ESTABLISHED)
python3 7082 root 1021u IPv4 2285743 0t0 TCP localhost:36379->localhost:42732 (ESTABLISHED)
python3 7082 root 1022u IPv4 3342593 0t0 TCP localhost:44599 (LISTEN)
python3 7082 root 1023u IPv4 3341441 0t0 TCP localhost:54166->localhost:44599 (ESTABLISHED)

This eventually hits the per-process file descriptor limit, after which no more file descriptors can be opened.
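The build-up can also be measured from inside the process rather than by reading lsof output. The following is a minimal sketch, assuming Linux (it reads /proc/self/fd); the helper names open_fd_count, fd_soft_limit and fd_growth are hypothetical and not part of azure-iot-device. It records how many descriptors each failed attempt leaves behind.

```python
import asyncio
import os
import resource
from typing import Awaitable, Callable, List


def open_fd_count() -> int:
    """Return the number of descriptors open in this process (Linux only)."""
    # Listing /proc/self/fd briefly opens the directory itself,
    # so one entry in the listing is subtracted.
    return len(os.listdir("/proc/self/fd")) - 1


def fd_soft_limit() -> int:
    """Return the per-process soft limit on open file descriptors."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


async def fd_growth(attempt: Callable[[], Awaitable[object]], repeats: int) -> List[int]:
    """Run `attempt` `repeats` times; return the fd count change per run."""
    deltas = []
    for _ in range(repeats):
        before = open_fd_count()
        try:
            await attempt()
        except Exception:
            # A failing attempt is exactly the case being measured.
            pass
        deltas.append(open_fd_count() - before)
    return deltas
```

In the scenario described here, `attempt` would be a coroutine function that creates the provisioning client and awaits client.register(). A run that consistently reports a positive delta per failed attempt would match the pattern in the lsof listing, and comparing open_fd_count() against fd_soft_limit() shows how close the process is to the limit.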

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

2 reactions
jonathon793 commented, Nov 4, 2021

Thanks for that information! That’s really good to know/understand. I’ll try updating my code to handle that behavior.

1 reaction
jonathon793 commented, Nov 9, 2021

Yes, thank you very much!
