[Issue/Bug?] Function Timeout
Hello! We are currently using several Azure Functions in production and have been facing daily timeouts for a few months.
Is your question related to a specific version? If so, please specify:
Function app details
- Deployment using AzureCLI.
- Docker runtime: mcr.microsoft.com/azure-functions/python:2.0.14494-python3.6-appservice
- We use this specific version, as suggested by Azure technical support, because we encountered some bugs with newer ones.
- Operating system: Linux
- AppService plan: Premium, P2V2: 1
What binding does your question apply to, if any? (e.g. Blob Trigger, Event Hub Binding, etc)
The FunctionApp contains several functions:
1. Some functions with an HTTP binding, which receive data and push it to EventHub
2. Some crons
3. Some functions with Event Hub bindings, which receive data (from 1.) and push it to MongoDB/Slack/BlobStorage
This code base is deployed to two Function Apps, each with the appropriate functions disabled: one (let's call it A) for "1." and another (let's call it B) for "2." and "3.".
Question
For safety reasons, we have a timeout of 2 minutes in host.json, knowing that every function should run in less than 200 ms.
For a few months, we have been facing timeouts on a daily basis. Those timeouts mostly impact one of the functions of Function App "A", the one which receives nearly all of the traffic. But we also have a few timeouts on Function App B. I did not find any obvious pattern, nor any excessive CPU or memory usage.
The weird part is that, when a timeout occurs, the logs which should be produced by the function's code are not generated, which might indicate that our function is not executed for some reason.
Here is a simplified version of our code with extensive logging:
# Function entry point (imports were not shown in the original snippet)
import logging
import azure.functions as func

event_producer: EventProducer = EventProducer(...)

async def main(req: func.HttpRequest) -> func.HttpResponse:
    data: dict = build_event_data(req)
    await event_producer.produce(data)
    return func.HttpResponse(status_code=200)

def build_event_data(req) -> dict:
    logging.info("Entering main")
    ...
    return ...

# EventProducer module (the aio client is assumed, since send_batch is awaited)
import asyncio
import base64
import logging
import bson
from azure.eventhub import EventData
from azure.eventhub.aio import EventHubProducerClient

class EventProducer:
    def __init__(self, password: str):
        self._password: str = password
        self.client = EventHubProducerClient.from_connection_string(self._password, logging_enable=True)
        # Serializes calls to send_batch (the lock workaround mentioned below)
        self.lock = asyncio.Lock()

    async def produce(self, data: dict):
        logging.info("Start BSON encode")
        bson_encoded_data = bson.BSON.encode(data)
        logging.info("Done BSON encode")
        logging.info("Start B64 encode")
        b64_encoded_data = base64.b64encode(bson_encoded_data)
        logging.info("Done B64 encode")
        logging.info("Start async with")
        async with self.lock:
            logging.info("In async with")
            logging.info("Start send batch")
            await self.client.send_batch([EventData(b64_encoded_data)])
            logging.info("Done send batch")
        logging.info("Done async with")
We have also been having issues with EventHubProducerClient, and I am currently working on them with @yunhaoling here. We used to have a threading approach (see here), but we have moved to async/await as advised by @yunhaoling.
Here are screenshots of the logs:
- Overall view of the executed functions: here
- All of the logs right before the timeout happens: here
- Logs of the FIRST execution which times out: here
From my understanding, there are only two explanations:
- Something is broken in our code: it keeps running and somehow prevents the execution of the function (is that even possible?)
- Something is wrong with the host system
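On the first explanation: in plain asyncio this is possible in general, because a coroutine that makes a blocking synchronous call holds the event loop, and no other coroutine on that loop can run its first line until the call returns. The sketch below only illustrates that mechanism, assuming handlers share a single event loop; it is not taken from the code in this issue and does not establish that this is what happened here. All function names are hypothetical.

```python
import asyncio
import time


async def blocking_handler():
    # A synchronous call inside a coroutine holds the event loop
    # for its whole duration.
    time.sleep(0.3)


async def other_handler(scheduled_at: float) -> float:
    # First line of the handler: how long after scheduling did it start?
    return time.monotonic() - scheduled_at


async def measure_start_delay() -> float:
    # Both handlers are scheduled back to back on the same loop.
    blocker = asyncio.ensure_future(blocking_handler())
    other = asyncio.ensure_future(other_handler(time.monotonic()))
    await blocker
    return await other


if __name__ == "__main__":
    delay = asyncio.run(measure_start_delay())
    print(f"other_handler started {delay:.2f}s after being scheduled")
```

In this sketch the second handler does no work of its own, yet it cannot even start until the blocking call in the first one returns, which would look like a handler whose first log line never appears.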
I have been investigating this for weeks now, running dozens of experiments.
I believe that we had several underlying issues; those related to EventHubProducerClient should be fixed by now (using a lock workaround).
Could you please have a look at it and advise me on how to proceed? If necessary, I can give you temporary access to our code base.
Thanks!
Issue Analytics
- Created 2 years ago
- Comments:8 (2 by maintainers)

Thanks a lot @stael for the detailed issue and bug report. I'll work with the team to go through this issue and improve our logging and documentation in the short term to get the knowledge out, and work with the EH extension team to see if this can be fixed in the future.
Thanks a lot again.
Hello @v-bbalaiagar,
here are the conclusions of the investigation and tests we did with @yunhaoling:
I believe that there should be a strong warning somewhere about using EventHubProducer in an AzureFunction.
Finally, I also found a bug: when you define an output, you must specify an eventHubName, which will be ignored in favor of the one provided in the connection. Example:
However, it seems that the eventHubName must not be shared between functions using distinct connections; otherwise, the data will be sent to the wrong EventHub. Indeed, I have several functions sending data to EventHub using the configuration defined above. For each function, I set the eventHubName to __MUST_EXISTS_BUT_WILL_BE_OVERWRITTEN_BY_CONNECTION__ for code clarity, and I ended up with data going to the wrong EventHub.
However, I also have several functions receiving data from EventHub. I also used __MUST_EXISTS_BUT_WILL_BE_OVERWRITTEN_BY_CONNECTION__ as eventHubName there, but in that case I did not encounter any issue.
Could you please have a look at that and dispatch it to the proper team? I believe that at least a clear warning should be added to the documentation.