Can a dag run depend on past?
I would like to have entire DAG runs depend on previous DAG runs.
Use case 1: I have turned off a DAG, and when I turn it back on I would like to prioritize earlier runs completing over later runs starting. In my DAG below it’s important that when runs are backed up we finish aggregate_db_message_job as soon as possible.
Use case 2: Some tasks in a DAG depend on previous instances of a different task. In this DAG I want clear_spark_logs to run after the send_email_notification_flow_successful of the previous run.
As I understand Airflow, setting depends_on_past to True for a DAG just sets the depends_on_past parameter for each task in that DAG, so each task waits only for its own previous instance, not for the whole previous run. Using priority weights also doesn’t seem to solve the problem, since the task pool might be large enough to start tasks from the next day’s run anyway.
What’s the proper way to solve this with Airflow?
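A simplified model of that behavior, written in plain Python rather than against the Airflow API, may help show why per-task depends_on_past does not hold back a whole run. The can_start helper and the two-task chain are illustrative assumptions; real scheduling also involves pools, priorities, and concurrency limits.

```python
# Simplified model: a task instance may start when its upstream tasks in the
# same run succeeded and, with depends_on_past, when the same task succeeded
# in the previous run. This is not Airflow's scheduler code.

# Hypothetical two-task chain using task names from the question.
UPSTREAM = {
    "aggregate_db_message_job": [],
    "clear_spark_logs": ["aggregate_db_message_job"],
}


def can_start(task, run, done, depends_on_past=True):
    """done is a set of (run_index, task_name) pairs that have succeeded."""
    upstream_ok = all((run, up) in done for up in UPSTREAM[task])
    past_ok = not depends_on_past or run == 0 or (run - 1, task) in done
    return upstream_ok and past_ok


# Run 0 has finished its first task but not its last one.
done = {(0, "aggregate_db_message_job")}

# Run 1's first task is already eligible, although run 0 is still incomplete.
print(can_start("aggregate_db_message_job", 1, done))  # True
# Run 1's last task is blocked on its upstream and on its own past instance.
print(can_start("clear_spark_logs", 1, done))  # False
```

In this model nothing ties the first task of a run to the last task of the previous run, which is the gap the question describes.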
Issue Analytics
- Created 8 years ago
- Comments:6 (5 by maintainers)
Top GitHub Comments
Cool. Didn’t think about using it within the same DAG. Thx. Will try that out. I am running hourly, so I could set the execution delta to hours=1. Is the execution delta a minimum requirement or a maximum requirement? In other words, does at least 1 hour difference need to occur for the trigger to fire? Or does the trigger fire up until a max difference of 1 hour occurs?
The external task sensor with an external_dag_id of the same DAG worked well. Thanks for pointing out that operator!
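On the minimum-versus-maximum question: going by the sensor's documented behavior, execution_delta is neither. ExternalTaskSensor subtracts it from the current run's logical date and polls for the external task at exactly that date. The plain-Python sketch below models only that lookup; the function names are hypothetical and this is not Airflow's own code.

```python
from datetime import datetime, timedelta


def target_logical_date(logical_date, execution_delta):
    """Return the single run date the sensor would look up."""
    return logical_date - execution_delta


def sensor_would_succeed(logical_date, execution_delta, succeeded_dates):
    """True only if the external task succeeded at exactly the target date."""
    return target_logical_date(logical_date, execution_delta) in succeeded_dates


current_run = datetime(2024, 1, 1, 12)  # placeholder date for an hourly run
one_hour = timedelta(hours=1)

# The run exactly one hour earlier satisfies the sensor.
print(sensor_would_succeed(current_run, one_hour, {datetime(2024, 1, 1, 11)}))
# A run two hours earlier does not, even though more than an hour has passed.
print(sensor_would_succeed(current_run, one_hour, {datetime(2024, 1, 1, 10)}))
```

With an hourly schedule and hours=1, each run therefore waits on the run immediately before it.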
Example:
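The commenter's own code is not included here. A minimal sketch of the approach described above, assuming Airflow 2.4 or later import paths and parameter names, might look like the following. The DAG id, start date, timeout, and the use of EmptyOperator stand-ins are assumptions; only the task names come from the question.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.sensors.external_task import ExternalTaskSensor

DAG_ID = "spark_flow"  # hypothetical DAG id, not taken from the thread

with DAG(
    dag_id=DAG_ID,
    start_date=datetime(2024, 1, 1),  # placeholder start date
    schedule="@hourly",
    catchup=True,
) as dag:
    # Wait for the last task of the run exactly one hour earlier in this
    # same DAG before any work in the current run begins.
    wait_for_previous_run = ExternalTaskSensor(
        task_id="wait_for_previous_run",
        external_dag_id=DAG_ID,
        external_task_id="send_email_notification_flow_successful",
        execution_delta=timedelta(hours=1),
        mode="reschedule",  # free the worker slot between checks
        timeout=6 * 60 * 60,  # assumed limit: fail after six hours of waiting
    )

    # Stand-ins for the real tasks named in the question.
    aggregate_db_message_job = EmptyOperator(task_id="aggregate_db_message_job")
    send_email_notification_flow_successful = EmptyOperator(
        task_id="send_email_notification_flow_successful"
    )
    clear_spark_logs = EmptyOperator(task_id="clear_spark_logs")

    (
        wait_for_previous_run
        >> aggregate_db_message_job
        >> send_email_notification_flow_successful
        >> clear_spark_logs
    )
```

One caveat with this layout: the very first scheduled run has no earlier run to wait for, so its sensor would keep polling until it times out unless that first sensor task is marked successful by hand or handled some other way.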