ExternalTaskSensor can never find External Parent Task

Apache Airflow version: 2.0

Kubernetes version (if you are using kubernetes) (use kubectl version): 1.18

Environment: Linux

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Linux
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened: Following the example at https://github.com/apache/airflow/blob/master/airflow/example_dags/example_external_task_marker_dag.py, when you trigger the parent task, it succeeds. When you then trigger the child task, it goes into a reschedule loop until it times out, because the execution date filter is a single-element list holding only the child task's execution date, which will never be the same as the parent's. See https://github.com/apache/airflow/blob/master/airflow/sensors/external_task.py#L216

AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=example_external_task_marker_child
AIRFLOW_CTX_TASK_ID=child_task1
AIRFLOW_CTX_EXECUTION_DATE=2021-01-14T17:33:14.178268+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-01-14T17:33:14.178268+00:00
[2021-01-14 17:33:49,750] {external_task.py:153} INFO - Poking for example_external_task_marker_parent.parent_task on 2021-01-14T17:33:14.178268+00:00 ... 
[2021-01-14 17:33:49,889] {taskinstance.py:1386} INFO - Rescheduling task, marking task as UP_FOR_RESCHEDULE
[2021-01-14 17:33:49,928] {local_task_job.py:118} INFO - Task exited with return code 0
[2021-01-14 17:35:22,581] {taskinstance.py:826} INFO - Dependencies all met for <TaskInstance: example_external_task_marker_child.child_task1 2021-01-14T17:33:14.178268+00:00 [queued]>
[2021-01-14 17:35:22,626] {taskinstance.py:826} INFO - Dependencies all met for <TaskInstance: example_external_task_marker_child.child_task1 2021-01-14T17:33:14.178268+00:00 [queued]>
[2021-01-14 17:35:22,627] {taskinstance.py:1017} INFO - 
--------------------------------------------------------------------------------
[2021-01-14 17:35:22,627] {taskinstance.py:1018} INFO - Starting attempt 1 of 1
[2021-01-14 17:35:22,627] {taskinstance.py:1019} INFO - 
--------------------------------------------------------------------------------
[2021-01-14 17:35:22,657] {taskinstance.py:1038} INFO - Executing <Task(ExternalTaskSensor): child_task1> on 2021-01-14T17:33:14.178268+00:00
[2021-01-14 17:35:22,665] {standard_task_runner.py:51} INFO - Started process 24 to run task
[2021-01-14 17:35:22,678] {standard_task_runner.py:75} INFO - Running: ['airflow', 'tasks', 'run', 'example_external_task_marker_child', 'child_task1', '2021-01-14T17:33:14.178268+00:00', '--job-id', '53', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dags/examples/external_sensor.py', '--cfg-path', '/tmp/tmpzb2z00at']
[2021-01-14 17:35:22,680] {standard_task_runner.py:76} INFO - Job 53: Subtask child_task1
[2021-01-14 17:35:22,881] {logging_mixin.py:103} INFO - Running <TaskInstance: example_external_task_marker_child.child_task1 2021-01-14T17:33:14.178268+00:00 [running]> on host exampleexternaltaskmarkerchildchildtask1-e6a0f9157da74583bb7373
[2021-01-14 17:35:23,667] {taskinstance.py:1230} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=example_external_task_marker_child
AIRFLOW_CTX_TASK_ID=child_task1
AIRFLOW_CTX_EXECUTION_DATE=2021-01-14T17:33:14.178268+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-01-14T17:33:14.178268+00:00
[2021-01-14 17:35:23,695] {external_task.py:153} INFO - Poking for example_external_task_marker_parent.parent_task on 2021-01-14T17:33:14.178268+00:00 ... 
[2021-01-14 17:35:23,833] {taskinstance.py:1386} INFO - Rescheduling task, marking task as UP_FOR_RESCHEDULE
[2021-01-14 17:35:23,852] {local_task_job.py:118} INFO - Task exited with return code 0
[2021-01-14 17:36:59,125] {taskinstance.py:826} INFO - Dependencies all met for <TaskInstance: example_external_task_marker_child.child_task1 2021-01-14T17:33:14.178268+00:00 [queued]>
[2021-01-14 17:36:59,172] {taskinstance.py:826} INFO - Dependencies all met for <TaskInstance: example_external_task_marker_child.child_task1 2021-01-14T17:33:14.178268+00:00 [queued]>
[2021-01-14 17:36:59,172] {taskinstance.py:1017} INFO - 
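
The reschedule loop in the log above comes down to an exact-match comparison on execution dates. The following is a minimal plain-Python sketch of that behavior, not Airflow's actual code: the `poke` helper and the in-memory `parent_rows` list are hypothetical stand-ins for the sensor's database query, and the parent's execution date is an assumed value, since the report does not give it.

```python
from datetime import datetime, timezone

# Execution date of the manually triggered child run, taken from the log above.
CHILD_RUN = datetime(2021, 1, 14, 17, 33, 14, 178268, tzinfo=timezone.utc)
# Assumed execution date of the earlier manual parent run (not given in the report).
PARENT_RUN = datetime(2021, 1, 14, 17, 30, 0, tzinfo=timezone.utc)


def poke(parent_rows, dttm_filter, allowed_states=("success",)):
    """Hypothetical stand-in for the sensor check.

    Returns True only if every date in dttm_filter has a matching
    parent row in an allowed state (exact timestamp equality).
    """
    matches = sum(
        1
        for execution_date, state in parent_rows
        if state in allowed_states and execution_date in dttm_filter
    )
    return matches == len(dttm_filter)


# Two manual triggers: the parent succeeded, but at a different timestamp.
found_manual = poke([(PARENT_RUN, "success")], [CHILD_RUN])
# Same check when the parent run shares the child's execution date.
found_aligned = poke([(CHILD_RUN, "success")], [CHILD_RUN])
print(found_manual, found_aligned)
```

Under these assumptions the first check never succeeds, however long the sensor keeps poking, because two manually triggered runs will almost never share a microsecond-precision timestamp.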

What you expected to happen: I would expect the filter to be a range, not a single timestamp. Alternatively, we should be able to pass in a date instead of a datetime.
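
The range idea can be sketched in plain Python as follows. This is only an illustration of the matching rule being asked for, not an Airflow API: `latest_parent_run`, its `window` argument, and the sample dates are all hypothetical. `ExternalTaskSensor` does accept `execution_delta` and `execution_date_fn` parameters, but how the candidate parent dates would be looked up inside such a callable is left out here.

```python
from datetime import datetime, timedelta, timezone


def latest_parent_run(child_execution_date, parent_execution_dates,
                      window=timedelta(hours=1)):
    """Return the most recent parent execution date that falls in
    [child_execution_date - window, child_execution_date], or None."""
    lower_bound = child_execution_date - window
    candidates = [
        d for d in parent_execution_dates
        if lower_bound <= d <= child_execution_date
    ]
    return max(candidates) if candidates else None


# Child execution date from the log; parent dates are assumed sample values.
child = datetime(2021, 1, 14, 17, 33, 14, 178268, tzinfo=timezone.utc)
parents = [
    datetime(2021, 1, 14, 16, 0, 0, tzinfo=timezone.utc),   # too old
    datetime(2021, 1, 14, 17, 30, 0, tzinfo=timezone.utc),  # inside the window
    datetime(2021, 1, 14, 17, 40, 0, tzinfo=timezone.utc),  # after the child
]
matched = latest_parent_run(child, parents)
print(matched)
```

With a window like this, a parent run triggered a few minutes before the child would be matched, which is the behavior the report expected from the example DAG.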

How to reproduce it: Run this example: https://github.com/apache/airflow/blob/master/airflow/example_dags/example_external_task_marker_dag.py

Anything else we need to know:

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
ephraimbuddy commented, Aug 10, 2021

@ephraimbuddy is the issue here only fixing example dag ?

Yes. @eladkal

0 reactions
eladkal commented, Aug 10, 2021

@ephraimbuddy is the issue here only fixing example dag ?
