Scheduler Memory Leak in Airflow 2.0.1

Apache Airflow version: 2.0.1

Kubernetes version (if you are using kubernetes) (use kubectl version): v1.17.4

Environment: Dev

  • OS (e.g. from /etc/os-release): RHEL7

What happened:

After running fine for some time, my Airflow tasks got stuck in the scheduled state with the following error in Task Instance Details: “All dependencies are met but the task instance is not running. In most cases this just means that the task will probably be scheduled soon unless: - The scheduler is down or under heavy load If this task instance does not start soon please contact your Airflow administrator for assistance.”

What you expected to happen:

After I restarted the scheduler, it started working fine again. When I checked my metrics, I realized the scheduler has a memory leak: over the past 4 days its memory utilization has grown to 6 GB.

In versions >2.0 we don’t even have the run_duration config option, which could be used to restart the scheduler periodically and avoid this issue until it is resolved.
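As a stopgap in the spirit of the removed run_duration option, a periodic restart can be driven from outside the scheduler. The following is only a minimal sketch, not something proposed in the issue: the helper name `run_with_periodic_restart`, the restart interval, and the 30-second grace period are all assumptions, and in a Kubernetes deployment a pod-level restart policy may be a better fit.

```python
import subprocess


def run_with_periodic_restart(cmd, max_runtime_seconds, cycles):
    """Run cmd `cycles` times, stopping each run after max_runtime_seconds.

    Returns the exit code of every run. On POSIX a negative code means
    the process was stopped by a signal (here, our own terminate()).
    """
    exit_codes = []
    for _ in range(cycles):
        proc = subprocess.Popen(cmd)
        try:
            # wait() returns early if the process exits on its own.
            proc.wait(timeout=max_runtime_seconds)
        except subprocess.TimeoutExpired:
            proc.terminate()  # polite stop request (SIGTERM on POSIX)
            try:
                proc.wait(timeout=30)  # assumed grace period
            except subprocess.TimeoutExpired:
                proc.kill()  # force stop if the process ignores the request
                proc.wait()
        exit_codes.append(proc.returncode)
    return exit_codes


# Hypothetical usage (interval chosen arbitrarily for illustration):
# run_with_periodic_restart(["airflow", "scheduler"], 6 * 60 * 60, cycles=4)
```

This bounds how long any single scheduler process can accumulate memory; it does not address the leak itself.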

How to reproduce it:

I saw this issue in multiple dev instances of mine, all running Airflow 2.0.1 on Kubernetes with the KubernetesExecutor. Below are the configs that I changed from the default config:

  • max_active_dag_runs_per_dag=32
  • parallelism=64
  • dag_concurrency=32
  • sql_alchemy_pool_size=50
  • sql_alchemy_max_overflow=30
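For anyone trying to reproduce this setup, the overrides can be expressed as environment variables using Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention. This is a sketch under assumptions: the issue does not say which config section each option lives in, so placing all five under `[core]` reflects the Airflow 2.0.x layout as I understand it, and `max_active_runs_per_dag` is assumed to be the option the reporter meant by `max_active_dag_runs_per_dag`. The helper names are hypothetical.

```python
def airflow_env_var(section, key):
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# (section, key, value) triples. The [core] section is an assumption for
# Airflow 2.0.x; later releases moved or renamed some of these options.
OVERRIDES = [
    # Assumed to be what the issue calls max_active_dag_runs_per_dag.
    ("core", "max_active_runs_per_dag", 32),
    ("core", "parallelism", 64),
    ("core", "dag_concurrency", 32),
    ("core", "sql_alchemy_pool_size", 50),
    ("core", "sql_alchemy_max_overflow", 30),
]


def build_env(overrides):
    """Map each override to its environment variable name and string value."""
    return {airflow_env_var(s, k): str(v) for s, k, v in overrides}


for name, value in build_env(OVERRIDES).items():
    print(f"{name}={value}")
```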

Anything else we need to know:

The scheduler memory leak occurs consistently in all instances I have been running. The scheduler's memory utilization keeps growing.
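To put a number on "keeps growing", one option is to fit a straight line to memory samples exported from whatever metrics system is in use. The sketch below is illustrative only: the function name `growth_rate` is hypothetical, and the sample values in the comment are made up, not taken from the issue.

```python
def growth_rate(samples):
    """Least-squares slope of memory usage over time.

    samples: list of (hours_since_start, memory_mb) pairs.
    Returns the estimated growth in MB per hour.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    numerator = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must span more than one point in time")
    return numerator / denominator


# Made-up example: memory sampled once per day (every 24 hours).
# rate = growth_rate([(0, 500), (24, 1900), (48, 3300), (72, 4700)])
```

A slope that stays positive across restarts-free periods of several days is consistent with a leak, while a slope near zero suggests normal fluctuation.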

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 11
  • Comments: 67 (46 by maintainers)

Top GitHub Comments

3 reactions
suhanovv commented, Sep 8, 2021

@potiuk the last fix works as it should

2 reactions
uranusjr commented, Mar 24, 2021

Please keep this thread on topic with the scheduler memory issue. For usage questions, please open threads in Discussions instead.

Top Results From Across the Web

[GitHub] [airflow] suxin1995 commented on issue #14924 ...
[GitHub] [airflow] suxin1995 commented on issue #14924: Scheduler Memory Leak in Airflow 2.0.1 · GitBox Wed, 31 Mar 2021 02:41:51 -0700.

[GitHub] [airflow] itispankajsingh opened a new issue #14924
It seems that the memory leak problem has not been solved well. ... got similar issue with Airflow 2.0.1 when using Kubernetes executor....

Airflow Scheduler out of memory problems - Stack Overflow
The parallelism setting will directly limit how many task are running simultaneously across all dag runs/tasks, which would have the most ...

7 Common Errors to Check When Debugging Airflow DAGs
If your Deployment is in this state, your Webserver might be hitting a memory limit when loading your DAGs even as your Scheduler...

Lessons Learned from Airflow 2.0 - Tenzo Blog
Lessons learnt from implementing Airflow 2.0 for our task management for ETL ... A DAG's Landing Times chart shows the delay from task...
