Scheduler Memory Leak in Airflow 2.0.1
Apache Airflow version: 2.0.1
Kubernetes version (if you are using kubernetes) (use kubectl version): v1.17.4
Environment: Dev
- OS (e.g. from /etc/os-release): RHEL7
What happened:
After running fine for some time, my Airflow tasks got stuck in the scheduled state with the following error in Task Instance Details: “All dependencies are met but the task instance is not running. In most cases this just means that the task will probably be scheduled soon unless: - The scheduler is down or under heavy load If this task instance does not start soon please contact your Airflow administrator for assistance.”
What you expected to happen:
I restarted the scheduler and then the tasks started running fine again. When I checked my metrics, I realized the scheduler has a memory leak: over the past 4 days it had reached up to 6 GB of memory utilization.
In 2.0 and later the run_duration config option is gone, so there is no built-in way to restart the scheduler periodically as a stop-gap until this issue is resolved.
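As a possible partial substitute for run_duration (a sketch, not an official recommendation): in Airflow 2.0 the [scheduler] num_runs option makes the scheduler exit after a fixed number of scheduling loops, so an external supervisor (the scheduler pod's restart policy, systemd, etc.) can restart it and reclaim leaked memory. Note that num_runs counts loops rather than seconds, so it is not an exact replacement for run_duration, and the value below is arbitrary.

```ini
[scheduler]
# Exit after 5000 scheduling loops (default -1 = run forever); an external
# supervisor is then expected to restart the scheduler process.
num_runs = 5000
```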
How to reproduce it:
I saw this issue in multiple dev instances of mine, all running Airflow 2.0.1 on Kubernetes with the KubernetesExecutor. Below are the configs I changed from the defaults (see the airflow.cfg sketch after this list):
- max_active_dag_runs_per_dag=32
- parallelism=64
- dag_concurrency=32
- sql_alchemy_pool_size=50
- sql_alchemy_max_overflow=30
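For reference, a sketch of how these overrides would look in airflow.cfg on 2.0, assuming the reported max_active_dag_runs_per_dag refers to the standard option max_active_runs_per_dag (all of these options live under [core] in 2.0):

```ini
[core]
parallelism = 64
dag_concurrency = 32
max_active_runs_per_dag = 32
sql_alchemy_pool_size = 50
sql_alchemy_max_overflow = 30
```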
Anything else we need to know:
The scheduler memory leak occurs consistently in every instance I have been running; the scheduler's memory utilization keeps growing.
Issue Analytics
- State:
- Created 3 years ago
- Reactions: 11
- Comments: 67 (46 by maintainers)
Top GitHub Comments
@potiuk the last fix works as it should
Please keep this thread on topic with the scheduler memory issue. For usage questions, please open threads in Discussions instead.