max_active_runs = 1 can still create multiple active execution runs

Edit: There is a separate issue affecting max_active_runs in 1.10.14. That regression is fixed in 1.10.15.

Edit2: Version v2.1.3 contains some fixes but also contains bad regressions involving max_active_runs. Use v2.1.4 for the complete fixes to this issue.

Edit3: Version 2.2.0 contains a fix for max_active_runs when runs are created with the dags trigger command or TriggerDagRunOperator. See https://github.com/apache/airflow/issues/18583

Apache Airflow version: 1.10.11, LocalExecutor

What happened:

I have max_active_runs = 1 in my DAG file (the DAG consists of multiple tasks) and I manually triggered the DAG. While that first execution was still running, a second execution began at its scheduled time.

I should note that the second execution is initially queued. It's only when the DAG's first execution moves on to its next task that the second execution actually starts.

My DAG definition. The DAG contains only tasks that use PythonOperator.

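from datetime import timedelta

from airflow import DAG

# default_args is defined elsewhere in the DAG file (not shown in this report).
dag = DAG(
    'dag1',
    default_args=default_args,
    description='xyz',
    schedule_interval=timedelta(hours=1),
    catchup=False,
    max_active_runs=1,  # intended to allow only one active run at a time
)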

What you expected to happen:

Only one execution should run. A second execution should be queued but not begin executing.

How to reproduce it: In my scenario:

  1. Manually trigger a DAG with multiple tasks (Execution1), and have task1 run past the start of the next scheduled execution. For example, if the schedule interval is 1 hour, have task1 take longer than 1 hour so that the second execution (Execution2) is queued.
  2. When task1 of Execution1 finishes, and just before task2 starts, the already queued Execution2 begins running.

Anything else we need to know: I think the second execution begins between task1 and task2 of Execution1. There seems to be a delay of a few seconds there, and maybe that's when Airflow thinks no DAG execution is active? That's just a guess.

Btw, this can have potentially disastrous effects (errors, incomplete data without errors, etc.).

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 34
  • Comments: 70 (41 by maintainers)

Top GitHub Comments

mik-laj commented, Aug 20, 2020 (7 reactions)

The problem is that we don't have a state that describes DAG Runs that are saved but not yet running. All DAG Runs are in the running state initially. If we want to fix this bug, we have to add a new DAG Run state.
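
The idea in this comment can be illustrated with a toy model. This is only a sketch, not Airflow's scheduler code: ToyDagRun, the state strings, and promote_queued_runs are hypothetical names invented for the example. It assumes runs are first saved in a separate "queued" state and are moved to "running" only while fewer than max_active_runs runs are already running.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical state names for this toy model (not Airflow constants).
RUNNING = "running"
QUEUED = "queued"    # the extra "saved but not running" state from the comment
SUCCESS = "success"


@dataclass
class ToyDagRun:
    run_id: str
    state: str


def promote_queued_runs(runs: List[ToyDagRun], max_active_runs: int) -> None:
    """Move queued runs to running, in list order, while capacity remains."""
    active = sum(1 for run in runs if run.state == RUNNING)
    for run in runs:
        if active >= max_active_runs:
            break
        if run.state == QUEUED:
            run.state = RUNNING
            active += 1


# Execution1 was triggered manually; Execution2 was created by the schedule.
runs = [ToyDagRun("manual_1", RUNNING), ToyDagRun("scheduled_1", QUEUED)]

# While the first run is still running, the second one must stay queued.
promote_queued_runs(runs, max_active_runs=1)
states_while_first_runs = [run.state for run in runs]

# Only after the first run has finished may the second one start.
runs[0].state = SUCCESS
promote_queued_runs(runs, max_active_runs=1)
states_after_first_finishes = [run.state for run in runs]
```

In this model the limit is checked against whole runs rather than against whichever task happens to be executing, so a gap between task1 and task2 of the first run does not free up a slot.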

fj-sanchez commented, Jun 9, 2021 (6 reactions)

Yeah, this is important for us as well, and it would be great to get it fixed ASAP. Also, it's currently the bug with the most 👍 reactions.
