tempfile.TemporaryDirectory does not get deleted after task failure
Discussed in https://github.com/apache/airflow/discussions/22403
Originally posted by m1racoli March 18, 2022
Apache Airflow version
2.2.4 (latest released)
What happened
When creating a temporary directory with tempfile.TemporaryDirectory() and then failing a task, the corresponding directory does not get deleted.
This happens in Airflow on Astronomer as well as locally in astro dev setups, with both LocalExecutor and CeleryExecutor.
What you think should happen instead
As in normal Python environments, the directory should get cleaned up, even in the case of a raised exception.
How to reproduce
Running this DAG will leave a temporary directory in the corresponding location:
import os
import tempfile

from airflow.decorators import dag, task
from airflow.utils.dates import days_ago


class MyException(Exception):
    pass


@task
def run():
    tmpdir = tempfile.TemporaryDirectory()
    print(f"directory {tmpdir.name} created")
    assert os.path.exists(tmpdir.name)
    raise MyException("error!")


@dag(start_date=days_ago(1))
def tempfile_test():
    run()


_ = tempfile_test()
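For comparison, here is a minimal sketch in plain Python (no Airflow involved) of the same failing function written with the context-manager form of `tempfile.TemporaryDirectory`. The names `run_with_context_manager` and `demo` are hypothetical and not part of the original report. With a `with` block, cleanup happens when the block exits, including on an exception, rather than depending on the object being finalized later. Whether this form is needed inside an Airflow task depends on the Airflow version in use.

```python
import os
import tempfile


class MyException(Exception):
    pass


def run_with_context_manager(seen_paths):
    # Hypothetical variant of run(): the `with` block removes the
    # directory on exit, even when the body raises.
    with tempfile.TemporaryDirectory() as tmpdir:
        seen_paths.append(tmpdir)  # record the path for inspection afterwards
        assert os.path.exists(tmpdir)
        raise MyException("error!")


def demo():
    # Run the failing function, swallow the expected error, and report
    # whether the temporary directory still exists afterwards.
    seen_paths = []
    try:
        run_with_context_manager(seen_paths)
    except MyException:
        pass
    return os.path.exists(seen_paths[0])


if __name__ == "__main__":
    print(demo())  # prints False: the directory was removed
```

The bare `tempfile.TemporaryDirectory()` call in the reproduction relies on implicit cleanup when the object is finalized, while the `with` form ties cleanup to the block itself.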
Operating System
Debian (Astronomer Airflow Docker image)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==1!3.0.0
apache-airflow-providers-cncf-kubernetes==1!3.0.2
apache-airflow-providers-elasticsearch==1!2.2.0
apache-airflow-providers-ftp==1!2.0.1
apache-airflow-providers-google==1!6.4.0
apache-airflow-providers-http==1!2.0.3
apache-airflow-providers-imap==1!2.2.0
apache-airflow-providers-microsoft-azure==1!3.6.0
apache-airflow-providers-mysql==1!2.2.0
apache-airflow-providers-postgres==1!3.0.0
apache-airflow-providers-redis==1!2.0.1
apache-airflow-providers-slack==1!4.2.0
apache-airflow-providers-sqlite==1!2.1.0
apache-airflow-providers-ssh==1!2.4.0
Deployment
Astronomer
Deployment details
GKE, vanilla astro dev, LocalExecutor and CeleryExecutor.
Anything else
Always
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct

Awesome! Great work @potiuk!
Really nice one - fix in #22475 😄 - tested it with: