Airflow web UI is slow

Apache Airflow version: 1.10.10

Kubernetes version (if you are using kubernetes) (use kubectl version): 1.13.12

Environment:

  • Cloud provider or hardware configuration: Azure
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened:

Every HTTP request to the UI takes at least 5 seconds, even for static content.

[Screenshot: Airflow - DAGs]

The /admin/metrics/ and /health endpoints and the 404 page have the same problem.

Here is a graph showing the CPU usage of all Airflow components:

[Graph: k8s-pods-2-thanos - Grafana]

Each container has a 1s CPU limit (left Y axis), so none of them is currently CPU bound.

What you expected to happen:

How to reproduce it:

Anything else we need to know:

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 26 (6 by maintainers)

Top GitHub Comments

danielnazareth89 commented, Aug 28, 2020 (19 reactions)

I fixed this by changing the gunicorn worker class from the default, sync, to an asynchronous class, namely gevent. See the AWS thread below for the odd interaction between gunicorn and ELBs.

https://forums.aws.amazon.com/thread.jspa?messageID=419138

So simply set AIRFLOW__WEBSERVER__WORKER_CLASS: "gevent" in your config, and things should improve.

natemoseman commented, Aug 26, 2020 (10 reactions)

I had a similar issue with Airflow running on a Kubernetes cluster, which I was fortunately able to solve.

Regular HTTP connections were taking a minimum of 5 seconds to complete. Even using curl to fetch static content, like a CSS file, took 5 seconds. Looking at the logs of the Airflow web process didn't reveal anything, although following them in real time showed that each GET appeared in the logs with the same 5-second delay the curl command was seeing.

Bypassing everything and running curl directly from the container, and thus bypassing all the Kubernetes networking, still showed the delays.
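For anyone who wants to reproduce this kind of measurement without curl, here is a minimal Python sketch using only the standard library. The function names (`time_request`, `summarize`, `probe`) are made up for this illustration, and the base URL and paths you pass in are assumptions about your own deployment, not something from my setup.

```python
import statistics
import time
import urllib.error
import urllib.request


def time_request(url, timeout=30.0):
    """Issue one GET request and return (status_code, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # include the body transfer in the timing
            status = response.status
    except urllib.error.HTTPError as err:
        # Error pages such as a 404 still show how long the server took.
        status = err.code
    return status, time.perf_counter() - start


def summarize(timings):
    """Reduce a list of elapsed times (seconds) to min / median / max."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def probe(base_url, paths, attempts=5):
    """Time each path several times and print one summary line per path."""
    for path in paths:
        timings = [time_request(base_url + path)[1] for _ in range(attempts)]
        stats = summarize(timings)
        print(f"{path}: min={stats['min']:.2f}s "
              f"median={stats['median']:.2f}s max={stats['max']:.2f}s")
```

Calling something like `probe("http://localhost:8080", ["/health"])` from inside the webserver container (the address is a placeholder) gives a quick check of whether even trivial endpoints sit at the 5-second floor. If the minimum is already around 5 seconds, the delay is unlikely to come from the page itself.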

It took me a few days to realize what was going on with my setup.

As it turned out, it was due to the 'type: LoadBalancer' service I was using to expose the Airflow webserver outside the cluster. The load balancer was an external network load balancer that connected to the service via a NodePort on each virtual machine in the cluster. For whatever reason, this meant that a large number of connections were kept open to the webserver at all times.

In a 20 node cluster this meant 20 connections.

So when I killed the LoadBalancer service and started using nginx-ingress instead, the problem instantly resolved itself. No more delays. The admin web UI went back to normal.

I am not exactly sure what was going on here, but I suspect that having a large number of connections always open was causing the gunicorn process to delay routing new connections to the pool of webserver worker processes. I was only using 4 processes at the time.

So if you are seeing these strange 5-second delays, use netstat or a similar tool to count the number of "ESTABLISHED" connections to the webserver process. If you have a lot of connections and you are using a 'type: LoadBalancer' service, try switching to an ingress controller. Increasing the number of worker processes to exceed the number of established connections will probably work too.
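The netstat check can be scripted roughly as follows. This is only a sketch: it assumes the Linux net-tools column layout of `netstat -tn` (protocol, queues, local address, foreign address, state), and the helper names `count_established` and `read_netstat` are made up for this example.

```python
import subprocess


def count_established(netstat_output, port):
    """Count ESTABLISHED TCP connections whose local port equals `port`.

    Assumes the Linux net-tools layout of `netstat -tn`:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the header lines and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_address, state = fields[3], fields[5]
        local_port = local_address.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and local_port == str(port):
            count += 1
    return count


def read_netstat():
    """Run `netstat -tn` (numeric TCP listing) and return its text output."""
    result = subprocess.run(
        ["netstat", "-tn"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running `count_established(read_netstat(), 8080)` inside the webserver container and comparing the result with your gunicorn worker count (4 in my case) shows whether idle connections outnumber the workers. Port 8080 is an assumption here; use whatever port your webserver actually listens on.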

Hope that helps.
