Python freeze/hang on exit

Context

Hello,

In a batch manager project, we use this Python client to submit jobs to the Kubernetes API. In short, the Python application loads the library, submits the job, watches the related events, cleans up/deletes the job, then returns the succeeded or failed status. But sometimes the application hangs at exit.

After investigation, it seems the ThreadPool used in the ApiClient class is not properly cleaned up when the Python process exits.

Reproduce

The easiest way to reproduce is to run this snippet:

python_version=3.6
kubernetes_version=4.0.0

docker run --name testing --rm -it --entrypoint "" python:$python_version /bin/bash -c "
pip install 'kubernetes==$kubernetes_version'
while true; do echo ===;
  for i in {0..50}; do python -c '

from kubernetes import client
coreapi = client.CoreV1Api()
print(0)' &

  done
wait
done"

This will run Python in a Docker container, install the Kubernetes Python module, then run the test indefinitely. Each iteration starts a simple application about 50 times in parallel to increase the probability of hitting the hang. The application loads the Kubernetes Python module, creates a CoreV1Api, which creates its own ApiClient (with async enabled through a ThreadPool), then prints 0. Seeing the 0 shows that the freeze occurred during the Python exit sequence, not before.

To stop the test:

docker rm -f testing

Expected:

This code should run indefinitely.

Result:

After some time, the loop hangs after printing a list of 0s: every process has printed its 0, but at least one never exits.

Workaround:

To avoid this issue, we override the ApiClient class to disable the async / ThreadPool feature. It seems to work without any issues so far. The downside is that we lose the async mode.
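For readers who want to picture the workaround, here is a minimal sketch of the idea. It does not use the real kubernetes ApiClient: `PooledClient`, `SyncOnlyClient`, and `call` are hypothetical stand-ins that only mimic the pattern described in this issue (a pool created in the constructor and joined in `__del__`). The real class has more state to initialise, so treat this as an illustration rather than a drop-in patch.

```python
from multiprocessing.pool import ThreadPool


class PooledClient:
    """Hypothetical stand-in: creates a pool eagerly and joins it in __del__."""

    def __init__(self):
        self.pool = ThreadPool()

    def __del__(self):
        self.pool.close()
        self.pool.join()

    def call(self, func, *args, async_call=False):
        # Async calls go through the pool, sync calls run inline.
        if async_call:
            return self.pool.apply_async(func, args)
        return func(*args)


class SyncOnlyClient(PooledClient):
    """Workaround sketch: never start the pool, so nothing is joined at exit."""

    def __init__(self):
        # Deliberately skip PooledClient.__init__ so no ThreadPool is created.
        self.pool = None

    def __del__(self):
        # No pool was created, so there is nothing to clean up.
        pass

    def call(self, func, *args, async_call=False):
        if async_call:
            raise RuntimeError("async mode is disabled in this workaround")
        return func(*args)


client = SyncOnlyClient()
print(client.call(sum, [1, 2, 3]))
```

The trade-off is the one mentioned above: any caller that asks for async behaviour has to be changed, because the subclass has no pool to hand the work to.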

Thank you.

Issue Analytics

  • State:closed
  • Created 6 years ago
  • Reactions:9
  • Comments:20 (4 by maintainers)

Top GitHub Comments

3 reactions
RobbieClarken commented, Dec 7, 2017

This seems to be related to the __del__ method on ApiClient cleaning up its ThreadPool: https://github.com/kubernetes-incubator/client-python/blob/30b8ee44f4d14546e651dead91306719d53f8c37/kubernetes/client/api_client.py#L76-L78

This can cause a deadlock when the api clients are garbage collected as Python exits. I can reproduce with the following:

# deadlock.py
from multiprocessing.pool import ThreadPool

class Deadlocker:
    def __init__(self):
        self.pool = ThreadPool()

    def __del__(self):
        self.pool.close()
        self.pool.join()

d1 = Deadlocker()
d2 = Deadlocker()
print('exiting...')

and running:

while true; do python3 deadlock.py; done

On macOS 10.12.6 with Python 3.6.3, after 1-50 executions, it will print "exiting..." and stall. The Python process won't terminate until you hit Ctrl-C. I can also reproduce this on Linux, but it seems to be less frequent than on macOS.

This means a simple script like this:

from kubernetes import client
coreapi = client.CoreV1Api()
batchapi = client.BatchV1Api()

may never terminate because CoreV1Api and BatchV1Api will both instantiate ApiClients which have the problematic __del__ method. We can reduce the likelihood of a deadlock by creating a single ApiClient and passing it into CoreV1Api and BatchV1Api but the problem doesn’t go away entirely. There are also some classes like Watch that always instantiate their own ApiClient.
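One way to sidestep the `__del__` problem is to avoid finalizers altogether and close the pool explicitly. The sketch below is only an illustration under my own assumptions: `ManagedPool` and its `threads` parameter are hypothetical names, not part of the kubernetes client. It creates the ThreadPool lazily on first use, closes it deterministically through a context manager, and registers an `atexit` hook as a fallback, so the join does not happen during garbage collection at interpreter shutdown.

```python
import atexit
from multiprocessing.pool import ThreadPool


class ManagedPool:
    """Hypothetical wrapper: lazy pool creation plus explicit cleanup."""

    def __init__(self, threads=4):
        self._threads = threads
        self._pool = None

    @property
    def pool(self):
        # Create the pool only when something actually needs it.
        if self._pool is None:
            self._pool = ThreadPool(self._threads)
            # Fallback in case the caller forgets to call close().
            atexit.register(self.close)
        return self._pool

    def close(self):
        # Safe to call more than once.
        if self._pool is not None:
            self._pool.close()
            self._pool.join()
            self._pool = None
            atexit.unregister(self.close)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


with ManagedPool(threads=2) as managed:
    squares = managed.pool.map(lambda n: n * n, [1, 2, 3])
print(squares)
```

Code that never touches `pool` never starts a thread, which also covers the simple script above where the API objects are created but no async call is made.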

I wonder whether the multiprocessing.pool.ThreadPool class is suitable for production use cases. According to a Stack Overflow comment I came across:

The multiprocessing.pool.ThreadPool is not documented as its implementation has never been completed. It lacks tests and documentation.

1 reaction
k8s-ci-robot commented, Feb 29, 2020

@furkanmustafa: You can’t reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
