Python freeze/hang on exit
Context
Hello,
In a batch manager project, we are using this Python client to submit jobs to the Kubernetes API. In short, the Python application loads the library, submits the job, watches the related events, cleans up/deletes the job, then returns the succeeded or failed status. But sometimes the application hangs at exit.
After investigation, it seems the ThreadPool used in the ApiClient class is not properly cleaned up when the Python process exits.
Reproduce
The easiest way to reproduce is to run this snippet:
python_version=3.6
kubernetes_version=4.0.0
docker run --name testing --rm -it --entrypoint "" python:$python_version /bin/bash -c "
pip install 'kubernetes==$kubernetes_version'
while true; do echo ===;
for i in {0..50}; do python -c '
from kubernetes import client
coreapi = client.CoreV1Api()
print(0)' &
done
wait
done"
This runs Python in a Docker container, installs the kubernetes Python module, then runs the test indefinitely. Each iteration launches a simple application about 50 times in parallel to increase the probability of hitting the hang. The application loads the kubernetes Python module and creates a CoreV1Api, which creates its ApiClient (with async enabled using a ThreadPool), then prints 0, showing that the freeze occurred during the Python exit sequence.
To stop the test:
docker rm -f testing
Expected:
This code should run indefinitely.
Result:
After some time, the loop hangs on a list of 0s.
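Rather than watching the output for a stalled list of 0s, the hang can be detected automatically by running the child process with a timeout. The following is only a minimal sketch: count_hangs is a hypothetical helper, not part of the kubernetes client, and the two child commands are stand-ins (one exits immediately, one sleeps to simulate a process that never exits).

```python
import subprocess
import sys


def count_hangs(cmd, runs, timeout):
    """Run cmd `runs` times and count the runs that do not exit within `timeout` seconds."""
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(cmd, stdout=subprocess.DEVNULL, timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before re-raising, so nothing is left behind.
            hangs += 1
    return hangs


# Stand-in child programs (assumptions for illustration only).
quick = [sys.executable, "-c", "print(0)"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]

if __name__ == "__main__":
    print("hangs (quick):", count_hangs(quick, runs=2, timeout=30))
    print("hangs (stuck):", count_hangs(stuck, runs=1, timeout=1))
```

To apply this to the reproduction above, the command would be replaced by the python -c snippet that imports the kubernetes client and creates a CoreV1Api, run in an environment where the module is installed.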
Workaround:
To avoid this issue, we override the ApiClient class to disable the async / ThreadPool feature. It seems to work without any issues so far. The downside is that we lose the async mode.
Thank you.
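For readers who want to keep an async mode, one possible alternative to tearing the pool down in a finalizer is to create it lazily and close it at a well-defined point. The sketch below is an assumption-laden illustration, not the kubernetes ApiClient: PooledClient and call_async are hypothetical names, and the only library calls used are multiprocessing.pool.ThreadPool and the atexit module.

```python
import atexit
from multiprocessing.pool import ThreadPool


class PooledClient:
    """Hypothetical stand-in for a client that offers asynchronous calls."""

    def __init__(self, pool_threads=1):
        self._pool_threads = pool_threads
        self._pool = None  # no worker threads exist until the first async call

    @property
    def pool(self):
        # Create the pool lazily, so purely synchronous users never start threads.
        if self._pool is None:
            self._pool = ThreadPool(self._pool_threads)
            # Clean up through atexit instead of relying on __del__.
            atexit.register(self.close)
        return self._pool

    def close(self):
        # Idempotent: safe to call explicitly and again from the atexit hook.
        if self._pool is not None:
            self._pool.close()
            self._pool.join()
            self._pool = None
            atexit.unregister(self.close)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    def call_async(self, func, *args):
        # Returns an AsyncResult; call .get() to wait for the value.
        return self.pool.apply_async(func, args)


# Usage: the pool is created on first use and joined when the block ends.
with PooledClient() as c:
    result = c.call_async(pow, 2, 10).get(timeout=5)
```

With this shape, a script that only makes synchronous calls never creates a ThreadPool at all, and a script that does use async calls joins the worker threads while the interpreter is still fully alive.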
Issue Analytics
- Created 6 years ago
- Reactions: 9
- Comments: 20 (4 by maintainers)
Top GitHub Comments
This seems to be related to the __del__ method on ApiClient cleaning up its ThreadPool: https://github.com/kubernetes-incubator/client-python/blob/30b8ee44f4d14546e651dead91306719d53f8c37/kubernetes/client/api_client.py#L76-L78
This can cause a deadlock when the API clients are garbage collected as Python exits. I can reproduce with the following:
and running:
On macOS 10.12.6 with Python 3.6.3, after 1-50 executions, it will print “exiting…” and stall. The Python process won’t terminate until you hit Ctrl-C. I can also reproduce on Linux, but it seems to happen less frequently than on macOS.
This means a simple script like this:
may never terminate because CoreV1Api and BatchV1Api will both instantiate ApiClients, which have the problematic __del__ method. We can reduce the likelihood of a deadlock by creating a single ApiClient and passing it into CoreV1Api and BatchV1Api, but the problem doesn’t go away entirely. There are also some classes, like Watch, that always instantiate their own ApiClient.
I wonder whether the multiprocessing.pool.ThreadPool class is suitable for production use cases. According to a Stack Overflow comment I came across:
@furkanmustafa: You can’t reopen an issue/PR unless you authored it or you are a collaborator.