Can a dag run depend on past?


I would like to have entire dag runs depend on previous dag runs.

Use case 1: I turned off a DAG, and when I turn it back on I would like earlier runs to complete before later runs start. In my DAG below, it's important that when runs are backed up we finish aggregate_db_message_job as soon as possible.

[Screenshot of the DAG, 2015-12-15, 6:47 pm]

Use case 2: Some tasks in a DAG depend on previous instances of a different task. In this DAG, I want clear_spark_logs to run only after send_email_notification_flow_successful from the previous run has finished.

[Screenshot of the DAG, 2015-12-15, 6:48 pm]

As I understand Airflow, setting depends_on_past to True for a DAG just sets the depends_on_past parameter for each task in that DAG. Using priority weights also doesn't seem to solve the problem, since the task pool might be large enough to start tasks from the next day's run.

What's the proper way to solve this with Airflow?

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

r39132 commented, Dec 16, 2015

Cool. Didn’t think about using it within the same DAG. Thx. Will try that out. I am running hourly, so I could set the execution delta to hours=1. Is the execution delta a minimum requirement or a maximum requirement? In other words, does at least 1 hour difference need to occur for the trigger to fire? Or does the trigger fire up until a max difference of 1 hour occurs?

wil5for commented, Dec 21, 2015

The external task sensor with an external_dag_id of the same dag worked well. Thanks for pointing out that operator!

Example:

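from datetime import timedelta
# Import path for Airflow 1.x, current when this thread was written;
# newer releases expose the sensor from airflow.sensors.external_task.
from airflow.operators.sensors import ExternalTaskSensor

# Block this run until 'task_to_wait_for_id' has succeeded in the run one hour earlier.
# 'dag' is the DAG object defined elsewhere in the same file.
wait_for_task_in_previous_hour = ExternalTaskSensor(
    task_id='wait_for_task_in_previous_hour',
    external_dag_id='my_dag_id',
    external_task_id='task_to_wait_for_id',
    allowed_states=['success'],
    execution_delta=timedelta(hours=1),
    dag=dag)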
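
On the earlier question of whether execution_delta is a minimum or a maximum: as far as the sensor's date arithmetic goes, it appears to be neither. The sensor subtracts execution_delta from the current run's execution date and looks for the external task at exactly that date. The sketch below mimics only that arithmetic with the standard library; the helper names sensor_target_date and would_match are hypothetical and are not part of Airflow.

```python
from datetime import datetime, timedelta


def sensor_target_date(execution_date, execution_delta):
    """Hypothetical helper: the single earlier execution date the sensor checks."""
    return execution_date - execution_delta


def would_match(current_run, candidate_run, execution_delta):
    """True only when candidate_run sits exactly one delta before current_run."""
    return candidate_run == sensor_target_date(current_run, execution_delta)


current = datetime(2015, 12, 16, 10, 0)
delta = timedelta(hours=1)

print(sensor_target_date(current, delta))                       # the 09:00 run
print(would_match(current, datetime(2015, 12, 16, 9, 0), delta))  # True
print(would_match(current, datetime(2015, 12, 16, 8, 0), delta))  # False
```

For an hourly DAG this means hours=1 points at the immediately preceding run; a run two hours back would not satisfy the sensor, and neither would one 30 minutes back.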

Top Results From Across the Web

Airflow depends_on_past for whole DAG - Stack Overflow
At your first task, set depends_on_past=True and wait_for_downstream=True , the combination will result in that current dag-run runs only if ...
Airflow: wait for previous dag run to complete
I want to set up a dag there are few cases that I would like to address while creating the dag. Next run...
Dependencies between DAGs in Apache Airflow
This sensor will lookup past executions of DAGs and tasks, and will match those DAGs that share the same execution_date as our DAG....
5 Things I wish I knew about Apache Airflow | by Michal Mrázek
depends_on_past — if set to True , it causes a task instance to depend on the success of its previous task instance. depends_on_past...
Setting --ignore-first-depends-on-past to True #20856 - GitHub
This may seem like a silly question, but when a DAG is scheduled and a dagrun is created, is the run function actually...
