
DAG getting stuck in "running" state indefinitely

See original GitHub issue

Apache Airflow version: 2.0.2

  • Kubernetes version (if you are using kubernetes) (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS: Ubuntu 18.04.3
  • Install tools: celery = 4.4.7, redis = 3.5.3

What happened: When I manually trigger my DAG, some of the tasks are stuck in the “queued” state in the logs.

[2021-05-21 16:55:57,808: WARNING/ForkPoolWorker-9] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,080: WARNING/ForkPoolWorker-17] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,203: WARNING/ForkPoolWorker-13] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,221: WARNING/ForkPoolWorker-5] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,247: WARNING/ForkPoolWorker-4] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,296: WARNING/ForkPoolWorker-10] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,362: WARNING/ForkPoolWorker-1] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,367: WARNING/ForkPoolWorker-8] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,433: WARNING/ForkPoolWorker-3] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,445: WARNING/ForkPoolWorker-11] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,458: WARNING/ForkPoolWorker-6] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,459: WARNING/ForkPoolWorker-2] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
[2021-05-21 16:55:58,510: WARNING/ForkPoolWorker-12] Running <TaskInstance: ******* 2021-05-21T08:54:59.100511+00:00 [queued]> on host *******
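Each of the log lines above shows a Celery worker reporting a task instance that is still flagged `[queued]`. A minimal sketch of pulling the worker slot and task state out of such a line (the task name and host below are made up, since the originals are redacted):

```python
import re

# Sample worker log line; task name and hostname are placeholders,
# as the originals are redacted in the issue.
line = ("[2021-05-21 16:55:58,080: WARNING/ForkPoolWorker-17] "
        "Running <TaskInstance: mydag.mytask 2021-05-21T08:54:59.100511+00:00 [queued]> "
        "on host worker-1")

# Extract the worker pool slot, task id, execution date, and task state.
pattern = re.compile(
    r"ForkPoolWorker-(?P<worker>\d+)\] Running "
    r"<TaskInstance: (?P<task>\S+) (?P<exec_date>\S+) \[(?P<state>\w+)\]>"
)

match = pattern.search(line)
if match:
    # For the sample line this prints: 17 queued
    print(match.group("worker"), match.group("state"))
```

Running this over the full worker log would show every slot holding a task instance that never left the “queued” state.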

Even when I mark them as “failed” and rerun them, they still get stuck. When I check the Airflow UI, the DAG is in the “running” state (screenshot omitted).

And when I check the subdags, some of them are in the “running” state (but nothing is happening) and others in the “scheduled” state (screenshot omitted).

I made sure to set all the other running tasks to “failed” before running this dag.

What you expected to happen: I expect all my tasks to run and my DAG to be marked as “success”, or “failed” if there is an issue.

How to reproduce it: It occurs when I run the following command: airflow celery worker. It doesn’t occur every time; sometimes the DAGs are not stuck and everything works well. I restarted the Airflow webserver, worker, and scheduler a few times, but it didn’t change anything.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 20 (6 by maintainers)

Top GitHub Comments

5 reactions
robin-cls commented, May 28, 2021

+1 for this issue, I am following how this unfolds

1 reaction
EtsuNDmA commented, Jul 19, 2021

The same issue with SubDagOperator in 2.1.2


Top Results From Across the Web

Example DAG gets stuck in "running" state indefinitely
What this means is that they are waiting to be picked up by the airflow scheduler. If the airflow scheduler is not running, you'll...
Airflow DAG job in running state but idle for long time
Therefore one has to wait for the "visibility timeout" (defaults to 6 hours in Airflow) before task foo gets reassigned to another worker.
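The "visibility timeout" mentioned here is a Celery broker setting; in Airflow it can be tuned in airflow.cfg under the [celery_broker_transport_options] section. A sketch (the value is illustrative; 21600 seconds is the 6-hour default quoted above):

```ini
# airflow.cfg -- applies to Redis (and SQS) brokers; value is in seconds.
# Tasks not acknowledged within this window are redelivered to another worker.
[celery_broker_transport_options]
visibility_timeout = 21600
```

If a task legitimately runs longer than this window, raising the timeout prevents the broker from handing the same task to a second worker.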
Example DAG gets stuck in “running” state indefinitely - iTecNote
The dag state remains "running" for a long time (at least 20 minutes by now), although from a quick inspection of this task...
7 Common Errors to Check When Debugging Airflow DAGs
1. Your DAG Isn't Running at the Expected Time · Airflow's Schedule Interval · Use Timetables for Simpler Scheduling · Airflow Time Zones....
Airflow Task Dont Start But Stuck On "Running" - ADocLib
If the runtime of the last successful or failed task is greater than the frequency of the DAG, then DAG/tasks are stuck for...
