[PREMIUM] Functions Host causes Container Exit/Restart
We have a Premium Python Durable Functions app that occasionally “hangs” for a very long time in between scheduled tasks.
We’ve investigated the issue quite a bit and discovered that the hanging is due to our container exiting.
We regularly test the resource usage of the Function App in KUDU by using the top command. While in KUDU we notice this message in the bottom tray when we experience the long hang:
So the container has evidently exited or restarted. When we run the “Container Crash” report in the “Diagnose and Solve problems” tab of our function app, we can see the following error message about the container exit:
Container exited unexpectedly: last 10 seconds logs [2020-12-15T17:29:01.710719291Z /azure-functions-host/start.sh: line 28: 18 Killed
The most notable portion is “18 Killed”
azure-functions-host/start.sh: line 28: 18 Killed
After more investigation it seems like “18” is referring to Process ID 18 which is the Microsoft+ process on our container, which I’m guessing is the functions host:

For some reason the functions host is killing this process, and then the container restarts immediately after - so we believe these are related.
Why does the functions host kill this process, and is this the cause for the container restarting? Is there anything we can do to fix this issue?
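For anyone looking for the same signature in their own container logs, here is a minimal sketch that pulls the script path, line number, and process ID out of a message like the one above. It assumes the line follows the usual format bash uses when a child process is terminated by a signal ("script: line N: PID Killed"). The function name `parse_killed` is hypothetical. Note that the message only shows that PID 18 was killed; it does not say what sent the signal.

```python
import re

# Pattern for the message bash prints when a child process dies from a signal,
# e.g. "/path/script.sh: line 28: 18 Killed".
KILLED_RE = re.compile(
    r"(?P<script>\S+): line (?P<line>\d+):\s+(?P<pid>\d+)\s+Killed"
)


def parse_killed(log_text):
    """Return (script, line_number, pid) for the first 'Killed' message, or None."""
    match = KILLED_RE.search(log_text)
    if match is None:
        return None
    return match.group("script"), int(match.group("line")), int(match.group("pid"))


# The log excerpt from the "Container Crash" report quoted above.
sample = (
    "Container exited unexpectedly: last 10 seconds logs "
    "[2020-12-15T17:29:01.710719291Z /azure-functions-host/start.sh: line 28: 18 Killed"
)
print(parse_killed(sample))
```

The extracted PID can then be compared against the process list seen in KUDU before the restart.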
Investigative information
Please provide the following:
- Timestamp: 2020-12-15T17:29:01.710719291Z
- Function App version: ~3
- Functions Host Version: 3.0.15149.0
- Function App name: labelright-test-v2
- Function name(s): can happen during any function invocation
- Invocation ID: not related to any particular function invocation
- Region: East US
Repro steps
It seems to occur randomly; it can happen under either heavy or light load.
The only way to know it's happening is to be watching KUDU during regular load, or to run the “Container Crash” report in the function app panel to see when it last occurred.
Expected behavior
The functions host does not kill the Microsoft+ process and does not cause a container restart.
Actual behavior
The functions host kills the Microsoft+ process and causes a container restart.
Related information
- Premium Function App Plan EP2 (2 vCPU, 7GB RAM)
- Only scaling out to one instance on this plan
- Using Python 3.6
- Using a custom docker image, inherits from the Azure python-3.6-appservice image
- Only additions to our container are installing Java and some libraries for working with PDFs
- This is a Durable Functions app that has a mix of I/O and CPU bound tasks
- The application converts PDFs to high quality images, uses tensorflow models for object detection, and uses OpenCV for computer vision and image analysis tasks
- FUNCTIONS_WORKER_PROCESS_COUNT = 8 (issue occurs with this setting at 2 as well)
Tagging @davidmrdavid from a previous conversation we had on the Durable Functions repo about the same topic:
https://github.com/Azure/azure-functions-durable-extension/issues/1573#issuecomment-730568055
Hey @davidmrdavid
Thanks for the follow up.
This past week I did a lot of experimenting and found the best settings for our app. I think we could close the issue, though I would like to write up the experience here so others with similar issues can search it up.
There were two specific bugs I was trying to fix:
The first issue I resolved in two steps. First, I found which parts of our app consumed the most memory and reduced their memory consumption as much as possible. After that, I was still seeing language workers being killed at 8 workers, so I started reducing the count. At 2 language workers, we were unable to reproduce the first issue above. I found it really helpful to force our function app to scale to a single VM, then use KUDU to SSH in and run the “top” command to view the language workers and how much memory they were using. Being on a single VM also ensured all traffic was going to the same pool of resources, and simulated the “worst-case scenario” of heavy work going to one VM, even if you can scale out to many.
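As a scriptable alternative to watching “top” by hand, the sketch below sums resident memory per command name from `ps` output. It is an illustration only: it assumes a Linux container where `ps -eo rss,comm` is available and reports RSS in KiB. The function names and the sample numbers are hypothetical and not taken from our app.

```python
import subprocess
from collections import defaultdict


def rss_by_command(ps_output):
    """Sum resident memory (KiB) per command name from `ps -eo rss,comm` text."""
    totals = defaultdict(int)
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # ignore lines that do not look like "<rss> <command>"
        rss_kib, command = parts
        totals[command.strip()] += int(rss_kib)
    return dict(totals)


def snapshot():
    """Run ps on the current machine and aggregate its output."""
    result = subprocess.run(
        ["ps", "-eo", "rss,comm"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return rss_by_command(result.stdout)


# Hypothetical sample output, for illustration only.
sample = """\
  RSS COMMAND
812000 python
790000 python
150000 dotnet
"""
for command, kib in sorted(rss_by_command(sample).items(), key=lambda kv: -kv[1]):
    print(f"{command}: {kib / 1024:.0f} MiB")
```

Calling `snapshot()` in a loop from the KUDU SSH session would give a rough view of how memory grows as the worker count changes.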
In regard to the container restarting, I think out-of-memory could be part of the issue, but this seemed to happen even under very light load. I was able to reproduce the container restart issue using a low-memory workflow with 8, 6, and even 4 language workers. The issue wasn’t totally resolved until we dropped language workers to 2. This is the same number of vCPU on our EP2 service plan. So while memory could be part of the issue, I think it’s possible that the system experiences CPU process locks if you use more language workers than you have vCPUs on your service plan. If you’re using Node or Python Azure Functions, I wouldn’t recommend using more language workers than you have vCPUs in your service plan (if you’re on premium) - even if your workload is low on resource usage.
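The rule of thumb above (no more language workers than vCPUs) can be sketched as a small helper. This is only an illustration of the arithmetic, not anything the Functions host does itself; the name `capped_worker_count` is hypothetical, and the result would still need to be set manually as the FUNCTIONS_WORKER_PROCESS_COUNT app setting.

```python
import os


def capped_worker_count(requested, vcpus=None):
    """Clamp a requested language worker count to the range [1, vcpus]."""
    if vcpus is None:
        vcpus = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(requested, vcpus))


# On an EP2 plan (2 vCPUs), a request for 8 workers is reduced to 2.
print(capped_worker_count(8, vcpus=2))
print(capped_worker_count(1, vcpus=2))
```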
I’m glad to hear about the PYTHON_THREADPOOL_THREAD_COUNT, and can’t wait to see the docs on this. Does this setting help a single language worker execute multiple function invocations in parallel, or does it just give a single function invocation access to more threads to work on?
PS: Even though we’ve had to work out a lot of issues to get things running just right, the application is very performant now and we’re loving Azure Functions. When we move to production, we’re thinking of running our function app in Kubernetes to expand beyond the VM sizes offered in the Elastic Premium plan and scale to more VMs.
Great work, great product!
Thanks David!
Marc DeMory, Emerging Technology Consultant, Accenture Liquid Studio - Chicago
On Thursday, March 11, 2021 at 1:02 PM, David Justo wrote (Azure/azure-functions-host #6985):
Closing this issue as it was resolved!
@marcd123, the new performance docs are available here: https://docs.microsoft.com/en-us/azure/azure-functions/python-scale-performance-reference
Reach out again if you need anything! ⚡ ⚡