Duplicate tasks invoked for a single task_id when manually invoked from the task details modal

Apache Airflow version: 1.10.11

Kubernetes version (if you are using kubernetes) (use kubectl version): NA

Environment:

  • Cloud provider or hardware configuration: AWS (EC2 instances)
  • OS (e.g. from /etc/os-release):
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
  • Kernel (e.g. uname -a): Linux airflow-scheduler-10-229-13-220 4.14.165-131.185.amzn2.x86_64 #1 SMP Wed Jan 15 14:19:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:

  • Others:

What happened:

When we manually invoke a task from the task details dialog, the task runs for approximately 22 seconds before the following appears in the log:

[2020-07-28 01:25:14,726] {local_task_job.py:150} WARNING - Recorded pid 26940 does not match the current pid 26751
[2020-07-28 01:25:14,728] {helpers.py:325} INFO - Sending Signals.SIGTERM to GPID 26757

The task is then killed. This is accompanied by a second failure shortly afterwards, which correlates with the new pid that has been written to the task_instance table.

Notably, if the task is scheduled as part of a normal DAG run, or by clearing its state and allowing the scheduler to schedule its execution, we do not experience any issue.

We have attempted to specify task_concurrency on our operators with no effect.

What you expected to happen: We expected a single process to be spawned for the manually executed task.

How to reproduce it: Manually invoke a task via the task details dialog, where the task's execution time is longer than the configured heartbeat interval.

The heartbeat check compares the recorded pid with the current pid, sees a mismatch, and kills the task.
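The check described above can be modelled with a short sketch. This is a simplified illustration, not Airflow's actual implementation: `TaskInstanceRecord` and `heartbeat_check` are hypothetical names, and the pids are the ones from the log excerpt in this report.

```python
from dataclasses import dataclass


@dataclass
class TaskInstanceRecord:
    """Hypothetical stand-in for a row in the task_instance table."""
    task_id: str
    pid: int


def heartbeat_check(recorded: TaskInstanceRecord, current_pid: int) -> str:
    """Return the action a supervising job would take on one heartbeat."""
    if recorded.pid != current_pid:
        # The recorded pid no longer belongs to this process, so the
        # supervisor treats its own task as stale and terminates it.
        return "terminate"
    return "continue"


# The first process (pid 26751) starts and records its own pid.
row = TaskInstanceRecord(task_id="example_task", pid=26751)
first_result = heartbeat_check(row, current_pid=26751)

# A duplicate process (pid 26940) starts for the same task_id and
# overwrites the recorded pid before the first process finishes.
row.pid = 26940
second_result = heartbeat_check(row, current_pid=26751)

print(first_result, second_result)
```

Under this model, a task that finishes before the next heartbeat never reaches the second check, which would be consistent with the issue only appearing for longer-running tasks.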

Anything else we need to know:

We can reproduce this reliably whenever the task execution time is longer than the heartbeat interval.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 5
  • Comments: 37 (17 by maintainers)

Top GitHub Comments

chandu-007 commented, Oct 28, 2021 (5 reactions)

We are also facing the same issue using Composer 1.17.3 and Airflow 2.1.2.

noelmcloughlin commented, Sep 15, 2021 (4 reactions)

On 2.1.3 I saw that a DAG containing a sequential task flow works fine. The signal happens if the DAG branches into parallel tasks. I rolled back to 2.1.2.