TaskFlow AirflowSkipException causes downstream step to fail
Apache Airflow version
2.3.2 (latest released)
What happened
I am using the TaskFlow API with two tasks that lead to the same downstream task. These tasks check for new data and, when it is found, set an XCom entry with the new filename for the downstream task to handle. If no data is found, the upstream tasks raise a skip exception. The downstream task has trigger_rule = none_failed_min_one_success.
The problem is that a skipped task does not set any XCom. When the downstream task starts, it raises the error:
airflow.exceptions.AirflowException: XComArg result from task2 at airflow_2_3_xcomarg_render_error with key="return_value" is not found!
What you think should happen instead
Based on the trigger rule none_failed_min_one_success, the expectation is that an upstream task should be allowed to skip and the downstream task should still run. While the downstream task does try to start based on the trigger rule, it never actually runs because the error is raised while rendering its arguments.
How to reproduce
The example DAG below generates the error when run.
from airflow.decorators import dag, task
from airflow.exceptions import AirflowSkipException


@task
def task1():
    # Upstream that finds new data and pushes the filename via XCom.
    return "example.csv"


@task
def task2():
    # Upstream that finds no data and skips, so no XCom is pushed.
    raise AirflowSkipException()


@task(trigger_rule="none_failed_min_one_success")
def downstream_task(t1, t2):
    print("task ran")


@dag(
    default_args={"owner": "Airflow", "start_date": "2022-06-07"},
    schedule_interval=None,
)
def airflow_2_3_xcomarg_render_error():
    t1 = task1()
    t2 = task2()
    downstream_task(t1, t2)


example_dag = airflow_2_3_xcomarg_render_error()
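As an aside, a possible workaround (not part of the original report) is to stop passing the XComArgs as arguments and have the downstream task pull the values itself: TaskInstance.xcom_pull returns None when an upstream pushed nothing, so no error is raised while rendering arguments. A rough sketch under that assumption; the DAG id and the explicit dependency wiring are illustrative only:

from airflow.decorators import dag, task
from airflow.exceptions import AirflowSkipException
from airflow.operators.python import get_current_context


@task
def task1():
    return "example.csv"


@task
def task2():
    raise AirflowSkipException()


@task(trigger_rule="none_failed_min_one_success")
def downstream_task():
    # Pull the upstream results explicitly; xcom_pull() returns None for a
    # task that was skipped and never pushed a value.
    ti = get_current_context()["ti"]
    for task_id in ("task1", "task2"):
        filename = ti.xcom_pull(task_ids=task_id)
        if filename is not None:
            print(f"processing {filename} from {task_id}")


@dag(
    default_args={"owner": "Airflow", "start_date": "2022-06-07"},
    schedule_interval=None,
)
def xcomarg_render_error_workaround():
    t1 = task1()
    t2 = task2()
    # Without XComArgs there is no implicit dependency, so wire it up
    # explicitly to keep the trigger rule meaningful.
    [t1, t2] >> downstream_task()


workaround_dag = xcomarg_render_error_workaround()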
Operating System
Ubuntu 20.04.4 LTS
Versions of Apache Airflow Providers
No response
Deployment
Virtualenv installation
Deployment details
No response
Anything else
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
Comments: 9 (6 by maintainers)
Unsure if helpful, but tossing in my vote for this as well; thought to share my use case. I expected trigger_rule to be respected rather than automatically failing downstream tasks. I have a downstream task that picks a random choice from any successful upstream. In the example above, I would expect the choose_cluster task to pick/pass cluster_b. Instead it throws key="return_value" is not found, as mentioned.

@ashb Previously this did return None. However, I do see that this could be indeterminate in the cases where the XCom value was set to None versus never set at all. Would returning the NOTSET object instead of raising an error work better? If that were done, the difference between an XCom that is None and an XCom that was never set could still be determined, in the event that someone needs to tell those cases apart.

Edit: I think the most important runtime behavior is that the XComArg.resolve method returns a value instead of raising an error. When an error is raised, the task won't run, even when the trigger rules meant that it should run. The exact value returned, None or some sentinel value (NOTSET) that someone can check for, should work, in my opinion.
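For illustration only (not code from the thread), if XComArg resolution returned None or a NOTSET-style sentinel for a skipped upstream instead of raising, the downstream task from the reproduction could simply guard on that value; the NOTSET object below is a hypothetical stand-in for whatever sentinel Airflow would choose:

from airflow.decorators import task

NOTSET = object()  # hypothetical sentinel standing in for "XCom was never set"


@task(trigger_rule="none_failed_min_one_success")
def downstream_task(t1, t2):
    # Under the proposed behavior a skipped upstream would resolve to None
    # (or a sentinel) instead of raising during argument rendering.
    for value in (t1, t2):
        if value is None or value is NOTSET:
            continue  # upstream skipped, nothing to process
        print(f"processing {value}")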