Deleting DAG runs through the UI causes incomplete deletion of DAG run details. This affects task logging
Apache Airflow version: 2.0.2
Kubernetes version:
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.9-eks-d1db3c", GitCommit:"d1db3c46e55f95d6a7d3e5578689371318f95ff9", GitTreeState:"clean", BuildDate:"2020-10-20T22:18:07Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Environment: bitnami/airflow: 2.0.2
- Cloud provider or hardware configuration: AWS, EKS K8 cluster
- OS: Debian GNU/Linux 10 (buster)
- Kernel: Linux airflow-scheduler-64d8c676ff-h5zkk 4.14.209-160.339.amzn2.x86_64 #1 SMP Wed Dec 16 22:44:04 UTC 2020 x86_64 GNU/Linux
- Install tools: helm 3.2.4
- Others: KubernetesExecutor
What happened:
When deleting a DAG run through the Web UI, the corresponding entries in the task_instance table of the Airflow metadata database are not deleted.
I noticed this whilst testing remote logging to S3. When the DAG runs the first time, the logs are generated and stored in S3 (and shown in the Airflow UI) successfully. When the DAG run is deleted and the DAG runs again, the logs are not generated.
I did some debugging and noticed that deleting the DAG run through the UI does not delete the task entries in the task_instance table. Deleting those entries manually, before the DAG runs again, fixed the logging problem (logs are written to S3 and the UI again).
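For reference, a minimal sketch of that manual cleanup step, assuming direct access to the Airflow metadata database through Airflow's own session; DAG_ID and EXECUTION_DATE are placeholders, not values from the original report:

```python
# Cleanup sketch (not an official workaround): delete the leftover
# task_instance rows for a DAG run that was already deleted in the UI,
# so that the next run of the DAG writes its task logs again.
from datetime import datetime

from airflow.models import TaskInstance
from airflow.settings import Session

DAG_ID = "example_dag"                 # placeholder: the affected DAG
EXECUTION_DATE = datetime(2021, 5, 1)  # placeholder: the deleted run's execution date

session = Session()
deleted = (
    session.query(TaskInstance)
    .filter(
        TaskInstance.dag_id == DAG_ID,
        TaskInstance.execution_date == EXECUTION_DATE,
    )
    .delete(synchronize_session=False)
)
session.commit()
session.close()
print(f"Deleted {deleted} task_instance rows for {DAG_ID} @ {EXECUTION_DATE}")
```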
What you expected to happen:
Deleting a DAG run through the UI should re-trigger the DAG run, and the logs for the tasks in that run should be written to the configured destination.
How to reproduce it:
- Run a DAG once
- Check that the logs have been written
- Delete the DAG run
- Run the DAG again
- Check that no log has been written (see the verification sketch below)
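To confirm the underlying cause after deleting the run, the leftover rows can be counted directly. This is a small verification sketch, assuming direct access to the Airflow metadata database; DAG_ID and EXECUTION_DATE are placeholders:

```python
# Verification sketch: count the task_instance rows that remain for a DAG run
# that was deleted through the UI. A non-zero count reproduces the problem.
from datetime import datetime

from airflow.models import TaskInstance
from airflow.settings import Session

DAG_ID = "example_dag"                 # placeholder
EXECUTION_DATE = datetime(2021, 5, 1)  # placeholder

session = Session()
leftover = (
    session.query(TaskInstance)
    .filter(
        TaskInstance.dag_id == DAG_ID,
        TaskInstance.execution_date == EXECUTION_DATE,
    )
    .count()
)
session.close()
print(f"{leftover} task_instance rows still present after deleting the DAG run")
```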
Top GitHub Comments
If you only want to delete the DagRun to re-run it – “Clear” it instead
Cool, thanks @kaxil
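For completeness, a minimal sketch of that suggested workaround done programmatically, assuming the DAG file is loadable from the configured DAGs folder; DAG_ID and EXECUTION_DATE are placeholders. Clearing resets the task instances so the scheduler re-runs them (and re-writes their logs) without deleting the DAG run:

```python
# Sketch of the "Clear" workaround from the comment above: clear the task
# instances of the run instead of deleting the DAG run itself.
from datetime import datetime

from airflow.models import DagBag

DAG_ID = "example_dag"                 # placeholder
EXECUTION_DATE = datetime(2021, 5, 1)  # placeholder

dag = DagBag().get_dag(DAG_ID)
# Clear all task instances whose execution_date falls in this (single-date) window.
dag.clear(start_date=EXECUTION_DATE, end_date=EXECUTION_DATE)
```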