
Why does the webserver report WORKER TIMEOUT?

See original GitHub issue
Apache Airflow version: 2.0.2
Kubernetes version:
	Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.4", GitCommit:"224be7bdce5a9dd0c2fd0d46b83865648e2fe0ba", GitTreeState:"clean", BuildDate:"2019-12-11T12:47:40Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
	Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.4", GitCommit:"224be7bdce5a9dd0c2fd0d46b83865648e2fe0ba", GitTreeState:"clean", BuildDate:"2019-12-11T12:37:43Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"} 
Gunicorn version: 19.10.0

Error log output when the webserver pod starts:

[2021-06-17 09:35:00,378] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry.OpenTelemetry could not be imported; pleaseadd opentelemetry-api and opentelemetry-instrumentationpackages in order to get BigQuery Tracing data.
[2021-06-17 09:35:00,418] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:01,963] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2021-06-17 09:35:02,052] {dagbag.py:451} INFO - Filling up the DagBag from /dev/null
[2021-06-17 09:35:03,317] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:03,389] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:13,116] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry.OpenTelemetry could not be imported; pleaseadd opentelemetry-api and opentelemetry-instrumentationpackages in order to get BigQuery Tracing data.
[2021-06-17 09:35:13,118] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry.OpenTelemetry could not be imported; pleaseadd opentelemetry-api and opentelemetry-instrumentationpackages in order to get BigQuery Tracing data.
[2021-06-17 09:35:13,128] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:13,132] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:13,132] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry.OpenTelemetry could not be imported; pleaseadd opentelemetry-api and opentelemetry-instrumentationpackages in order to get BigQuery Tracing data.
[2021-06-17 09:35:13,194] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:13,403] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry.OpenTelemetry could not be imported; pleaseadd opentelemetry-api and opentelemetry-instrumentationpackages in order to get BigQuery Tracing data.
[2021-06-17 09:35:13,420] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:14,502] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:14,591] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:14,708] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:14,941] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:17,614] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'
[2021-06-17 09:35:17,724] {providers_manager.py:299} WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb' 
.............
$ cat gunicorn_error.log
[2021-06-17 09:35:05 +0000] [58] [INFO] Starting gunicorn 19.10.0
[2021-06-17 09:35:05 +0000] [58] [INFO] Listening at: http://0.0.0.0:8080 (58)
[2021-06-17 09:35:05 +0000] [58] [INFO] Using worker: sync
[2021-06-17 09:35:05 +0000] [64] [INFO] Booting worker with pid: 64
[2021-06-17 09:35:05 +0000] [65] [INFO] Booting worker with pid: 65
[2021-06-17 09:35:05 +0000] [66] [INFO] Booting worker with pid: 66
[2021-06-17 09:35:05 +0000] [67] [INFO] Booting worker with pid: 67
[2021-06-17 09:35:35 +0000] [58] [INFO] Handling signal: ttin
[2021-06-17 09:35:35 +0000] [204] [INFO] Booting worker with pid: 204
[2021-06-17 09:35:42 +0000] [58] [INFO] Handling signal: ttou
[2021-06-17 09:40:22 +0000] [58] [CRITICAL] WORKER TIMEOUT (pid:64)
[2021-06-17 09:40:22 +0000] [58] [CRITICAL] WORKER TIMEOUT (pid:65)
[2021-06-17 09:40:22 +0000] [58] [CRITICAL] WORKER TIMEOUT (pid:67)
[2021-06-17 09:40:22 +0000] [65] [INFO] Worker exiting (pid: 65)
[2021-06-17 09:40:22 +0000] [64] [INFO] Worker exiting (pid: 64)
[2021-06-17 09:40:22 +0000] [67] [INFO] Worker exiting (pid: 67)
[2021-06-17 09:40:23 +0000] [250] [INFO] Booting worker with pid: 250
[2021-06-17 09:40:23 +0000] [251] [INFO] Booting worker with pid: 251
[2021-06-17 09:40:25 +0000] [58] [CRITICAL] WORKER TIMEOUT (pid:66)
[2021-06-17 09:40:25 +0000] [66] [INFO] Worker exiting (pid: 66)
[2021-06-17 09:40:26 +0000] [316] [INFO] Booting worker with pid: 316
[2021-06-17 09:40:36 +0000] [58] [INFO] Handling signal: ttin
[2021-06-17 09:40:36 +0000] [355] [INFO] Booting worker with pid: 355 
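A note on the ttin/ttou lines above: Airflow's webserver periodically refreshes its gunicorn workers by sending the master TTIN (spawn one fresh worker) followed by TTOU (retire one old worker), which matches the 09:35:35/09:35:42 entries. The CRITICAL lines then show workers being killed roughly 300 seconds after boot, which lines up with the web_server_worker_timeout of 300 configured below. The refresh cadence is governed by two [webserver] settings; the values shown are, to my knowledge, the Airflow 2.0.x defaults and are included only to illustrate the mechanism:

[webserver]
# every cycle, the gunicorn master receives TTIN (spawn a new worker)
# and then TTOU (retire an old one)
worker_refresh_interval = 30
worker_refresh_batch_size = 1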

airflow.cfg:

  • web_server_master_timeout -> 300
  • web_server_worker_timeout -> 300
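These map to the [webserver] section of airflow.cfg. If the config file is baked into the image, the same values can also be overridden per container through Airflow's AIRFLOW__{SECTION}__{KEY} environment-variable convention; the env entries below are an illustrative sketch that would go into the webserver container's env: list in the Deployment that follows:

[webserver]
web_server_master_timeout = 300
web_server_worker_timeout = 300   # passed to gunicorn as its worker --timeout

- name: AIRFLOW__WEBSERVER__WEB_SERVER_MASTER_TIMEOUT
  value: "300"
- name: AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT
  value: "300"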

I built the image using the Airflow 2.0.2 Dockerfile as the base image.

My YAML:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: ops-airflow-docker-151358
  labels:
    name: airflow-151358
    runEnv: live
    io.kompose.service: airflow-151358
spec:
  replicas: 1
  revisionHistoryLimit: 3
  minReadySeconds: 10
  strategy:
    type: Recreate
  selector:
    matchLabels:
      name: airflow-151358
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: airflow-151358
        runEnv: live
        io.kompose.service: airflow-151358
    spec:
      dnsPolicy: Default
      hostNetwork: false
      restartPolicy: Always
      volumes:
      - name: host-log-dir
        hostPath:
          path: /docker-logs/applogs/ops-airflow-docker
      - name: webserver-claim0
        hostPath:
          path: /home/airflow/dags
      - name: webserver-claim1
        hostPath:
          path: /home/airflow/logs
      - name: scheduler-claim0
        hostPath:
          path: /home/airflow/dags
      - name: scheduler-claim1
        hostPath:
          path: /home/airflow/logs
      - name: worker-claim0
        hostPath:
          path: /home/airflow/dags
      - name: worker-claim1
        hostPath:
          path: /home/airflow/logs
      containers:
      - name: webserver
        ports:
        - containerPort: 8080
        env:
        - name: DETONATOR_LOG_ON_CONSOLE
          value: "false"
        - name: DETONATOR_SIG_TERM_DELAY
          value: "10"
        - name: DETONATOR_COLORED_CONSOLE_OUTPUT
          value: "false"
        - name: DETONATOR_START_TIMEOUT
          value: "600"
        - name: DY_APP_LOG_DIR
          value: /home/www/logs/applogs/ops-airflow-docker
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: NODE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: LOAD_EX
          value: '"y"'
        - name: FERNET_KEY
          value: aKSXskUMeVdRpOBOVsxWUFixZkRgMMCq6Z6_fdae6yo=
        - name: EXECUTOR
          value: Celery
        - name: containerPort
          value: "8080"
        resources:
          limits:
            cpu: "2"
            memory: 3Gi
          requests:
            cpu: "1"
            memory: 2Gi
        image: *******/ops-airflow-docker:20210617162117-master-e334b7b1
        volumeMounts:
        - name: host-log-dir
          mountPath: /home/www/logs/applogs/ops-airflow-docker
        - name: webserver-claim0
          mountPath: /opt/airflow/dags
        - name: webserver-claim1
          mountPath: /opt/airflow/logs
        args:
        - webserver
      - name: flower
        ports:
        - containerPort: 5555
        env:
        - name: DETONATOR_LOG_ON_CONSOLE
          value: "false"
        - name: DETONATOR_SIG_TERM_DELAY
          value: "10"
        - name: DETONATOR_COLORED_CONSOLE_OUTPUT
          value: "false"
        - name: DETONATOR_START_TIMEOUT
          value: "600"
        - name: DY_APP_LOG_DIR
          value: /home/www/logs/applogs/ops-airflow-docker
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: NODE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: EXECUTOR
          value: Celery
        - name: FLOWER_PORT
          value: "5555"
        - name: containerPort
          value: "5555"
        resources:
          limits:
            cpu: "2"
            memory: 2Gi
          requests:
            cpu: "1"
            memory: 1Gi
        image: ***/ops-airflow-docker:20210617162117-master-e334b7b1
        volumeMounts:
        - name: host-log-dir
          mountPath: /home/www/logs/applogs/ops-airflow-docker
        args:
        - celery
        - flower
      - name: scheduler
        env:
        - name: DETONATOR_LOG_ON_CONSOLE
          value: "false"
        - name: DETONATOR_SIG_TERM_DELAY
          value: "10"
        - name: DETONATOR_COLORED_CONSOLE_OUTPUT
          value: "false"
        - name: DETONATOR_START_TIMEOUT
          value: "600"
        - name: DY_APP_LOG_DIR
          value: /home/www/logs/applogs/ops-airflow-docker
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: NODE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: LOAD_EX
          value: '"y"'
        - name: FERNET_KEY
          value: aKSXskUMeVdRpOBOVsxWUFixZkRgMMCq6Z6_fdae6yo=
        - name: EXECUTOR
          value: Celery
        resources:
          limits:
            cpu: "2"
            memory: 2Gi
          requests:
            cpu: "1"
            memory: 1Gi
        image: ***/ops-airflow-docker:20210617162117-master-e334b7b1
        volumeMounts:
        - name: host-log-dir
          mountPath: /home/www/logs/applogs/ops-airflow-docker
        - name: scheduler-claim0
          mountPath: /opt/airflow/dags
        - name: scheduler-claim1
          mountPath: /opt/airflow/logs
        args:
        - scheduler
      - name: worker
        env:
        - name: DETONATOR_LOG_ON_CONSOLE
          value: "false"
        - name: DETONATOR_SIG_TERM_DELAY
          value: "10"
        - name: DETONATOR_COLORED_CONSOLE_OUTPUT
          value: "false"
        - name: DETONATOR_START_TIMEOUT
          value: "600"
        - name: DY_APP_LOG_DIR
          value: /home/www/logs/applogs/ops-airflow-docker
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: NODE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: FERNET_KEY
          value: aKSXskUMeVdRpOBOVsxWUFixZkRgMMCq6Z6_fdae6yo=
        - name: EXECUTOR
          value: Celery
        resources:
          limits:
            cpu: "4"
            memory: 4Gi
          requests:
            cpu: "1"
            memory: 2Gi
        image: ***/ops-airflow-docker:20210617162117-master-e334b7b1
        volumeMounts:
        - name: host-log-dir
          mountPath: /home/www/logs/applogs/ops-airflow-docker
        - name: worker-claim0
          mountPath: /opt/airflow/dags
        - name: worker-claim1
          mountPath: /opt/airflow/logs
        args:
        - celery
        - worker
      imagePullSecrets:
      - name: dik-renge
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: name
                  operator: In
                  values:
                  - airflow-151358
              topologyKey: kubernetes.io/hostname
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: name
                operator: In
                values:
                - airflow-151358
            topologyKey: kubernetes.io/hostname
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: devops
                operator: In
                values:
                - enable
status: {}
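One side note on the manifest: extensions/v1beta1 for Deployment was removed in Kubernetes 1.16, which is the server version reported above, so re-applying this spec may fail on that cluster. The usual fix is to switch to apps/v1 (a sketch; the rest of the spec stays unchanged, and the required spec.selector.matchLabels is already present):

apiVersion: apps/v1
kind: Deployment
# ...metadata and spec exactly as above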

I think this is also why accessing the webserver is very slow.

I would like to know what causes this error and how to solve it.

Thank you very much.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

1 reaction
LChloe commented, Jun 21, 2021

> @LChloe, have you installed all the dependencies on the webserver node? From the logs it is visible that multiple ones are missing. Is there anything in the .err file generated by the webserver?

Thank you for your reply. I see the following two problems in the startup logs:

  1. INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add the opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
  2. WARNING - Exception when importing 'airflow.providers.google.common.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package: No module named 'airflow.providers.google.common.hooks.leveldb'

Problem 1 is due to the google-cloud-bigquery version being 1.28.0 (#13131). Problem 2, I think, has no real impact (#15451).

So far I have not seen any other error log output. I will try upgrading google-cloud-bigquery to 2.6.2 and updating the LevelDB dependency.
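If the image installs packages with pip, the upgrade might look roughly like this; the exact pins and the leveldb extra are my assumptions, so they should be checked against the Airflow 2.0.2 constraints file:

$ pip install --upgrade "google-cloud-bigquery==2.6.2"
$ pip install "apache-airflow[leveldb]==2.0.2"   # assumed extra; pulls in plyvel for the LevelDB hook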

0 reactions
LChloe commented, Jun 30, 2021

> I think you need to look in your Kubernetes logs. My guess is that some resources are missing and your gunicorn workers cannot start - but this is likely not an Airflow problem; it is a problem connected with your deployment. Examine your K8s logs. I heartily recommend the K9s tool for that: it helps to analyse all things Kubernetes, and you will be able to easily browse the logs generated by different parts of the system and see the errors generated by Kubernetes.

Thank you for your suggestion. I think so too, because everything runs normally under Docker Compose. But I have not found the K8s error log yet, so I will try the K9s tool you recommend.
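For reference, plain kubectl can surface much of what K9s shows; something like the following, where the pod hash is a placeholder and the container name matches the Deployment above:

$ kubectl describe pod ops-airflow-docker-151358-<hash>   # look for OOMKilled, evictions, probe failures
$ kubectl logs ops-airflow-docker-151358-<hash> -c webserver --previous   # logs from the last crashed container
$ kubectl get events --sort-by=.lastTimestamp
$ kubectl top pod ops-airflow-docker-151358-<hash> --containers   # live usage vs. the limits above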

Read more comments on GitHub >

Top Results From Across the Web

airflow gunicorn [CRITICAL] WORKER TIMEOUT
I am using env variables to set executor, Postgres and Redis info to the webserver. I am using CMD-SHELL [ -f /home/airflow/airflow/airflow- ...

How to resolve the gunicorn critical worker timeout error?
If the above fix doesn't work, then increase the Gunicorn timeout flag in the Gunicorn configuration; the default Gunicorn timeout is 30 seconds. --timeout 90.

apache/incubator-airflow - Gitter
Does anyone know what it means when the webserver's gunicorn workers keep timing ... [2017-07-06 10:42:06 +0000] [112] [CRITICAL] WORKER TIMEOUT (pid:123) ...

Gunicorn timing out web/server workers - Redash Discourse
The main gunicorn process will kill the worker process if it does not complete startup in 30 seconds and report back to the...

Why am I getting H12 request timeouts when using Unicorn ...
The below explanation applies to both Unicorn and Gunicorn web servers. Your Unicorn master worker is timing out and killing web workers. With...
