Deleting DAG runs through the UI causes incomplete deletion of DAG run details. This affects task logging capabilities.

Apache Airflow version: 2.0.2

Kubernetes version:

Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.9-eks-d1db3c", GitCommit:"d1db3c46e55f95d6a7d3e5578689371318f95ff9", GitTreeState:"clean", BuildDate:"2020-10-20T22:18:07Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Environment: bitnami/airflow: 2.0.2

  • Cloud provider or hardware configuration: AWS, EKS K8 cluster
  • OS : Debian GNU/Linux 10 (buster)
  • Kernel : Linux airflow-scheduler-64d8c676ff-h5zkk 4.14.209-160.339.amzn2.x86_64 #1 SMP Wed Dec 16 22:44:04 UTC 2020 x86_64 GNU/Linux
  • Install tools: helm 3.2.4
  • Others: KubernetesExecutor

What happened:

When deleting a DAG run through the Web UI, the entries in the table task_instance (Airflow DB) don’t get deleted.

I have noticed this whilst testing the logging to S3. When the DAG runs the first time, the logs are generated and stored to S3 (and Airflow UI) successfully. When the DAG run is deleted and the DAG runs again, the logs are not generated.

I did some debugging and I have noticed that deleting the DAG run through the UI doesn’t delete the entries for the tasks in the task_instance table. Deleting those entries manually, before the DAG runs again, has fixed the logging problem (logs are written again to S3 and UI).
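The leftover rows described above can be found with a join between the two tables. The following is only a minimal sketch: it uses an in-memory SQLite database with a simplified stand-in schema (the real Airflow metadata tables have many more columns), and the DAG name, task names and dates are hypothetical. It assumes that task instances are tied to a DAG run by `dag_id` and `execution_date`, which should be checked against the actual schema of the Airflow version in use.

```python
import sqlite3


def find_orphaned_task_instances(conn):
    """Return (dag_id, task_id, execution_date) rows that have no matching dag_run."""
    query = """
        SELECT ti.dag_id, ti.task_id, ti.execution_date
        FROM task_instance AS ti
        LEFT JOIN dag_run AS dr
          ON dr.dag_id = ti.dag_id
         AND dr.execution_date = ti.execution_date
        WHERE dr.dag_id IS NULL
        ORDER BY ti.dag_id, ti.execution_date, ti.task_id
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Create a simplified, hypothetical stand-in for the Airflow metadata DB."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, run_id TEXT);
        CREATE TABLE task_instance (
            task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT
        );
        """
    )
    # One DAG run still exists; the run for 2021-05-11 was "deleted".
    conn.execute(
        "INSERT INTO dag_run VALUES ('example_dag', '2021-05-12T00:00:00', 'manual_1')"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
        [
            ("extract", "example_dag", "2021-05-12T00:00:00", "success"),
            ("extract", "example_dag", "2021-05-11T00:00:00", "success"),
            ("load", "example_dag", "2021-05-11T00:00:00", "success"),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in find_orphaned_task_instances(build_demo_db()):
        print(row)
```

Against a real deployment, the same `SELECT` could be run on the metadata database before deciding whether to delete rows by hand.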

What you expected to happen:

Deleting a DAG run through the UI should re-trigger the DAG run, and the logs for the tasks in that run should be written to the configured destination.

How to reproduce it:

  1. Run a DAG once
  2. Check that the logs have been written
  3. Delete the DAG run through the UI and let the DAG run again
  4. Check that no log has been written for the new run

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
kaxil commented, May 15, 2021

If you only want to delete the DagRun to re-run it – “Clear” it instead
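For reference, clearing can also be done from the command line rather than the UI. Below is a rough sketch that builds an `airflow tasks clear` invocation from Python. It assumes the Airflow 2.x CLI is on the `PATH` of the machine where it runs; the helper names, the DAG id and the dates are hypothetical and not taken from the issue.

```python
import subprocess


def build_clear_command(dag_id, start_date, end_date):
    """Build the argument list for `airflow tasks clear` (Airflow 2.x CLI)."""
    return [
        "airflow", "tasks", "clear",
        "--start-date", start_date,  # first execution date to clear
        "--end-date", end_date,      # last execution date to clear
        "--yes",                     # skip the interactive confirmation
        dag_id,
    ]


def clear_dag_run(dag_id, start_date, end_date):
    """Run the clear command; raises CalledProcessError if the CLI fails."""
    subprocess.run(build_clear_command(dag_id, start_date, end_date), check=True)


if __name__ == "__main__":
    # Hypothetical DAG id and date range, printed rather than executed.
    print(" ".join(build_clear_command("example_dag", "2021-05-12", "2021-05-12")))
```

Clearing resets the task instances so the scheduler runs them again, which avoids the leftover `task_instance` rows that deleting the DAG run leaves behind.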

0 reactions
darthale commented, May 18, 2021

Cool, thanks @kaxil

Top Results From Across the Web

Deleting dag runs - Google Groups
No, deleting DagRuns doesn't affect related task instances.

successful DAG run fails to be scheduled after being manually ...
Deleting DAG runs through the UI causes incomplete deletion of DAG run details. This affects tasks logging capabilities #15818.

In Cloud composer1, dag last task fail with incomplete log on ...
Incomplete logs often means the Airflow worker pod was evicted, which is usually when a node in a Kubernetes cluster is running out...

Deleting files on Amazon S3
Apache Airflow preserves historical DAG runs. After a DAG has been run in Apache Airflow, it remains in the Airflow DAGs list regardless...

7 Common Errors to Check When Debugging Airflow DAGs
Tasks not running? DAG stuck? Logs nowhere to be found? We've been there. Here's a list of common snags and some corresponding fixes...
