DAG.add_task doesn't pick up start_date default arg
Apache Airflow version: 2.1.0
What happened:
When attempting to set the dag for a task (Operator) that was created without a dag, where the dag’s start_date is set via default_args, an error is raised: AirflowException: Task is missing the start_date parameter.
What you expected to happen: The start_date from default_args to be honored and no error to occur.
Related notes:
This relates to an issue in 1.x: when the bitshift operator is used with Operators that don’t specify a dag (or are not within a DAG’s context), and the dag’s start_date is set via default_args, the bitshift operation errors with: AirflowException: Task is missing the start_date parameter. That issue was raised several times in the past (most recently in #7996), and @ashb had a solution in #5598, but it was put on hold since support for bitshift operators was being debated for Airflow 2.x.
How to reproduce it:
from datetime import datetime
from airflow import DAG
from airflow.operators.dummy import DummyOperator
default_args = {
'owner': 'airflow',
'start_date': datetime(2017, 2, 1)
}
dag = DAG('my_dag', default_args=default_args)
dummy = DummyOperator(task_id='dummy')
dag.add_task(dummy)
# # Rest of the flow
# dummy2 = DummyOperator(task_id='dummy2')
# dag.add_task(dummy2)
# dummy >> dummy2
Setting start_date directly instead of from default_args works as expected:
...
dag = DAG('my_dag', start_date=datetime(2017, 2, 1), default_args=default_args)
...
While the example is contrived, I think it’s important functionality for reusable, pre-configured Operators, such as are used in factory and composition patterns.
Top GitHub Comments
Personally I wish Airflow never had default_args; it’s too magical when there is a logical native Python approach (**kwargs). But I guess that ship sailed way back, and we should make it work more closely to what users expect in this case.
Using the DAG code as shared by the author will show a broken DAG message.
However, if using the code as:
it will work.
In our examples we never show usage of dag.add_task(), and I’m not sure why this is needed here. Noting also Ash’s comment to set the start_date on the DAG level. Since the recommended way is to use the context manager, and this is working fine, I’m closing this issue.
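For anyone trying to follow why the forms discussed in this thread behave differently, here is a toy model in plain Python. It is not Airflow's source code. ToyDAG, ToyOperator and the _CONTEXT stack are hypothetical names, and the two assumptions it encodes are that default_args are merged into a task only when the task is constructed with a DAG in scope, and that the DAG's own start_date is not filled in from default_args. Under those assumptions it reproduces the behaviour reported above.

```python
from datetime import datetime

# Hypothetical stand-in for "the DAG currently entered via `with`".
_CONTEXT = []


class ToyDAG:
    def __init__(self, dag_id, start_date=None, default_args=None):
        self.dag_id = dag_id
        # Assumption: the DAG's own start_date is NOT taken from default_args.
        self.start_date = start_date
        self.default_args = dict(default_args or {})
        self.tasks = {}

    def __enter__(self):
        _CONTEXT.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        _CONTEXT.pop()

    def add_task(self, task):
        # Assumed check: fail when neither the DAG nor the task has a start_date.
        if self.start_date is None and task.start_date is None:
            raise ValueError("Task is missing the start_date parameter")
        if task.start_date is None:
            task.start_date = self.start_date
        self.tasks[task.task_id] = task


class ToyOperator:
    def __init__(self, task_id, dag=None, start_date=None):
        if dag is None and _CONTEXT:
            dag = _CONTEXT[-1]
        # Assumption: default_args are merged only here, at construction time.
        if dag is not None and start_date is None:
            start_date = dag.default_args.get("start_date")
        self.task_id = task_id
        self.start_date = start_date
        if dag is not None:
            dag.add_task(self)


default_args = {"owner": "airflow", "start_date": datetime(2017, 2, 1)}

# Case 1: task built outside any DAG, attached later (the reported failure).
dag1 = ToyDAG("my_dag", default_args=default_args)
try:
    dag1.add_task(ToyOperator("dummy"))
    late_attach_error = None
except ValueError as err:
    late_attach_error = str(err)

# Case 2: task built inside the DAG's context, so default_args are visible.
with ToyDAG("my_dag", default_args=default_args) as dag2:
    in_context = ToyOperator("dummy")

# Case 3: start_date set directly on the DAG, task attached later.
dag3 = ToyDAG("my_dag", start_date=datetime(2017, 2, 1), default_args=default_args)
explicit = ToyOperator("dummy")
dag3.add_task(explicit)
```

In this model the late-attached task in case 1 never gets a chance to see default_args, which is why a pre-configured Operator from a factory would hit the error unless the DAG carries start_date itself (case 3) or the task is created inside the DAG's context (case 2).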