
DAG.add_task doesn't pick up start_date default arg

Apache Airflow version: 2.1.0

What happened: Calling dag.add_task() on a task (Operator) that has no dag set yet, where the dag’s start_date comes only from default_args, raises an error: AirflowException: Task is missing the start_date parameter.

What you expected to happen: The start_date from default_args to be honored and no error to occur.

Related notes: This relates to an issue in 1.x. When the bitshift operator is used with Operators that don’t specify a dag (or are not within a DAG’s context) and with a dag whose start_date is set via default_args, the bitshift operation fails with: AirflowException: Task is missing the start_date parameter. That issue was raised several times in the past (most recently in #7996), and @ashb had a solution in #5598, but it was put on hold since support for bitshift operators was being debated for Airflow 2.x.

How to reproduce it:

from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

default_args = {
    'owner': 'airflow',
    'start_date': datetime(2017, 2, 1)
}

dag = DAG('my_dag', default_args=default_args)
dummy = DummyOperator(task_id='dummy')
dag.add_task(dummy)  # raises AirflowException: Task is missing the start_date parameter
# # Rest of the flow
# dummy2 = DummyOperator(task_id='dummy2')
# dag.add_task(dummy2)

# dummy >> dummy2

Setting start_date directly instead of from default_args works as expected:

...
dag = DAG('my_dag', start_date=datetime(2017, 2, 1), default_args=default_args)
...

While the example is contrived, I think this is important functionality for reusable, pre-configured Operators, such as those used in factory and composition patterns.
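To illustrate why the two snippets above behave differently, here is a minimal, Airflow-free sketch. It assumes that default_args are consulted only when an operator is constructed with a known dag, and that add_task() looks only at the dag's own start_date. ToyDag and ToyOperator are hypothetical stand-ins written for this illustration; they are not Airflow's classes, and the real implementation may differ in detail.

```python
from datetime import datetime


class ToyDag:
    """Hypothetical stand-in for a DAG; NOT Airflow's implementation."""

    def __init__(self, dag_id, start_date=None, default_args=None):
        self.dag_id = dag_id
        # Only the explicit argument is stored here; default_args is kept aside.
        self.start_date = start_date
        self.default_args = default_args or {}
        self.tasks = []

    def add_task(self, task):
        # Assumed check, modelled on the error message quoted in the report.
        if not self.start_date and not task.start_date:
            raise ValueError("Task is missing the start_date parameter")
        if not task.start_date:
            task.start_date = self.start_date
        self.tasks.append(task)


class ToyOperator:
    """Hypothetical stand-in for an operator; defaults are resolved once, at construction."""

    def __init__(self, task_id, dag=None, start_date=None):
        # default_args can only be read if the dag is known right now.
        if dag is not None and start_date is None:
            start_date = dag.default_args.get("start_date")
        self.task_id = task_id
        self.start_date = start_date
        if dag is not None:
            dag.add_task(self)


defaults = {"owner": "airflow", "start_date": datetime(2017, 2, 1)}

# Case 1: operator built without a dag, attached later. Defaults are never consulted.
dag_a = ToyDag("dag_a", default_args=defaults)
late = ToyOperator("late")
try:
    dag_a.add_task(late)
    late_error = None
except ValueError as exc:
    late_error = str(exc)

# Case 2: dag known at construction time. Defaults are applied to the task.
dag_b = ToyDag("dag_b", default_args=defaults)
early = ToyOperator("early", dag=dag_b)

# Case 3: start_date set on the dag itself. add_task can fall back to it.
dag_c = ToyDag("dag_c", start_date=datetime(2017, 2, 1), default_args=defaults)
direct = ToyOperator("direct")
dag_c.add_task(direct)
```

Under these assumptions, only the first case fails, which matches the behaviour described in this report: the late-attached task never sees default_args, while setting start_date on the dag directly gives add_task() something to fall back on.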

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

uranusjr commented, Jun 29, 2021 (1 reaction)

Personally I wish Airflow never had default_args; it’s too magical given that a logical native Python approach (**kwargs) exists. But I guess that ship sailed way back, and we should make it work more closely to what users expect in this case.
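One possible reading of the "native Python approach" mentioned in this comment is sharing defaults through plain dictionary unpacking. The sketch below shows only that language feature; make_task is a hypothetical stand-in for an operator constructor, not an Airflow API.

```python
from datetime import datetime


def make_task(task_id, owner=None, start_date=None, retries=0):
    """Hypothetical stand-in for an operator constructor."""
    return {
        "task_id": task_id,
        "owner": owner,
        "start_date": start_date,
        "retries": retries,
    }


# Shared defaults live in an ordinary dict.
shared = {"owner": "airflow", "start_date": datetime(2017, 2, 1)}

# Every call site passes the defaults explicitly, so nothing is resolved implicitly.
first = make_task("dummy", **shared)

# Per-task overrides are merged on top of the shared defaults.
second = make_task("dummy2", **{**shared, "retries": 2})
```

With this style the defaults are visible at each call site, so a task built before it is attached to anything still carries its start_date.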

eladkal commented, Nov 2, 2022 (0 reactions)

Using the DAG code as shared by the author shows a broken DAG message.

However, if the code is written as:

default_args = {
    'owner': 'airflow',
    'start_date': datetime(2017, 2, 1)
}
with DAG('my_dag', default_args=default_args):
    DummyOperator(task_id='dummy')

It will work.

In our examples we never show usage of dag.add_task(), and I’m not sure why it is needed here. Note also Ash’s comment about setting the start_date at the DAG level. Since the recommended way is to use the context manager, and that works fine, I’m closing this issue.

Top Results From Across the Web

dag.py raises: "airflow.exceptions.AirflowException: Task is ...
That is because two of your tasks have not been assigned to the DAG which contains the start_date in default_args.

Assigning operator to DAG via bitwise composition does not ...
I believe to fix this, on assignment, we would need to go back and go through dag.default_args to see if any of those...

airflow.models.dag — Airflow Documentation
Returned dates can be used for execution dates. Parameters. start_date – The start date of the interval. end_date – The end date of...

Concepts - Apache Airflow Documentation - Read the Docs
default_args=dict( start_date=datetime(2016, 1, 1), owner='Airflow') dag ... This worker will then only pick up tasks wired to the specified queue(s).

API Reference — Airflow Documentation
Read the FAQ entry about start_date for more information. end_date (datetime) – if ... It is invoked when tasks are added to the...
