New grid view in Airflow 2.3.0 has very slow performance on large DAGs relative to tree view in 2.2.5

Apache Airflow version

2.3.0 (latest released)

What happened

I upgraded a local dev deployment of Airflow from 2.2.5 to 2.3.0, then loaded the new /dags/<dag_id>/grid page for a few dag ids.

On a big DAG, I’m seeing 30+ second latency on the /grid API, followed by a 10+ second delay each time I click a green rectangle. For a smaller DAG I tried, the page was pretty snappy.

I went back to 2.2.5 and loaded the tree view for comparison, and saw that the /tree/ endpoint on the large DAG had 9 seconds of latency, and clicking a green rectangle had instant responsiveness.

This is slow enough that it would be a blocker for my team to upgrade.
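For anyone who wants to put numbers on the latencies described above, here is a minimal timing sketch. It is not part of the original report. The helper names `fetch` and `time_endpoint` are hypothetical, and the URL to measure is left to the caller because the exact endpoint path depends on the Airflow version and deployment.

```python
import time
import urllib.request
from typing import Callable, List


def fetch(url: str) -> bytes:
    """Download the full response body for url (no authentication handling)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def time_endpoint(
    url: str,
    repeats: int = 3,
    get: Callable[[str], bytes] = fetch,
) -> List[float]:
    """Return the wall-clock seconds taken by each of `repeats` sequential requests."""
    timings = []
    for _ in range(repeats):
        started = time.perf_counter()
        get(url)  # the body is discarded; only the elapsed time matters here
        timings.append(time.perf_counter() - started)
    return timings
```

The Airflow webserver normally requires a logged-in session, so in practice `get` would have to be replaced with a callable that sends the session cookie; the default `fetch` is only a placeholder for an unauthenticated endpoint.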

What you think should happen instead

The grid view should perform at least as well as the tree view it replaces.

How to reproduce

Generate a large DAG. Mine looks like the following:

  • 900 tasks
  • 150 task groups
  • 25 historical runs

Compare against a small DAG, in my case:

  • 200 tasks
  • 36 task groups
  • 25 historical runs

The large DAG is unusable; the small DAG is usable.
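A DAG of roughly these shapes could be generated with something like the sketch below. This is an assumption about how to reproduce the sizes, not the actual DAG from the report: the names `plan_layout` and `build_dag`, the `group_N` / `task_N` ids, the fixed start date, and the choice of independent `EmptyOperator` tasks are all made up for illustration, and the real DAG's dependency structure is not known.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def plan_layout(n_tasks: int, n_groups: int) -> Dict[str, List[str]]:
    """Spread n_tasks task ids as evenly as possible over n_groups group ids."""
    layout: Dict[str, List[str]] = {f"group_{g}": [] for g in range(n_groups)}
    for t in range(n_tasks):
        layout[f"group_{t % n_groups}"].append(f"task_{t}")
    return layout


def build_dag(dag_id: str, n_tasks: int, n_groups: int, n_runs: int = 25):
    """Build a DAG of no-op tasks in task groups (assumes Airflow 2.3 is installed)."""
    # Imported lazily so plan_layout can be checked without Airflow available.
    from airflow import DAG
    from airflow.operators.empty import EmptyOperator
    from airflow.utils.task_group import TaskGroup

    start = datetime(2022, 1, 1)  # arbitrary date in the past
    with DAG(
        dag_id=dag_id,
        start_date=start,
        # A bounded daily schedule with catchup should backfill about n_runs runs.
        end_date=start + timedelta(days=n_runs - 1),
        schedule_interval="@daily",
        catchup=True,
    ) as dag:
        for group_id, task_ids in plan_layout(n_tasks, n_groups).items():
            with TaskGroup(group_id=group_id):
                for task_id in task_ids:
                    EmptyOperator(task_id=task_id)  # tasks are left independent
    return dag
```

In a DAG file, the two cases would be assigned at module level so the scheduler can discover them, for example `large = build_dag("grid_perf_large", 900, 150)` and `small = build_dag("grid_perf_small", 200, 36)`.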

Operating System

Ubuntu 20.04.3 LTS (Focal Fossa)

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

Docker-Compose deployment on an EC2 instance running Ubuntu. The Airflow web server is a nearly stock image from apache/airflow:2.3.0-python3.9.

Anything else

Screenshot of load time: [image]

GIF of click latency: [GIF, 2022-05-17]

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 11 (8 by maintainers)

Top GitHub Comments

3 reactions
bbovenzi commented, May 25, 2022

Yes, that change was just for dynamic tasks. I am working on more optimizations. They just didn’t make it in time for 2.3.1.

2 reactions
bbovenzi commented, May 26, 2022

Going to reopen, as we can still do more to improve performance for large DAGs.
