tempfile.TemporaryDirectory does not get deleted after task failure

Discussed in https://github.com/apache/airflow/discussions/22403

<div type='discussions-op-text'>

Originally posted by m1racoli March 18, 2022

Apache Airflow version

2.2.4 (latest released)

What happened

When creating a temporary directory with tempfile.TemporaryDirectory() and then failing a task, the corresponding directory does not get deleted.

This happens in Airflow on Astronomer as well as locally in astro dev setups, with both LocalExecutor and CeleryExecutor.

What you think should happen instead

As in normal Python environments, the directory should get cleaned up, even in the case of a raised exception.
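
For comparison, here is a minimal sketch of the plain-Python behavior this expectation refers to, using the context-manager form of `tempfile.TemporaryDirectory`, which removes the directory as soon as the `with` block exits, including when an exception is raised. The names `failing_step` and `run_and_check` are hypothetical and used only for illustration. Whether the `with` form also avoids the leftover directory inside an Airflow task is an assumption that this thread does not confirm.

```python
import os
import tempfile


def failing_step(seen_paths):
    """Create a temporary directory, record its path, then fail (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        seen_paths.append(tmpdir)
        assert os.path.isdir(tmpdir)
        # The exception leaves the `with` block, which triggers the cleanup.
        raise RuntimeError("error!")


def run_and_check():
    """Run the failing step and report whether its directory still exists."""
    seen_paths = []
    try:
        failing_step(seen_paths)
    except RuntimeError:
        pass
    path = seen_paths[0]
    return path, os.path.exists(path)


if __name__ == "__main__":
    path, still_exists = run_and_check()
    print(f"{path} still exists: {still_exists}")
```

The `with` form does not depend on garbage collection or on interpreter shutdown hooks, so it may be the more predictable choice when a function is run inside a worker process.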

How to reproduce

Running this DAG will leave a temporary directory in the corresponding location:

import os
import tempfile

from airflow.decorators import dag, task
from airflow.utils.dates import days_ago


class MyException(Exception):
    pass


@task
def run():
    tmpdir = tempfile.TemporaryDirectory()
    print(f"directory {tmpdir.name} created")
    assert os.path.exists(tmpdir.name)

    raise MyException("error!")


@dag(start_date=days_ago(1))
def tempfile_test():
    run()


_ = tempfile_test()

Operating System

Debian (Astronomer Airflow Docker image)

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==1!3.0.0
apache-airflow-providers-cncf-kubernetes==1!3.0.2
apache-airflow-providers-elasticsearch==1!2.2.0
apache-airflow-providers-ftp==1!2.0.1
apache-airflow-providers-google==1!6.4.0
apache-airflow-providers-http==1!2.0.3
apache-airflow-providers-imap==1!2.2.0
apache-airflow-providers-microsoft-azure==1!3.6.0
apache-airflow-providers-mysql==1!2.2.0
apache-airflow-providers-postgres==1!3.0.0
apache-airflow-providers-redis==1!2.0.1
apache-airflow-providers-slack==1!4.2.0
apache-airflow-providers-sqlite==1!2.1.0
apache-airflow-providers-ssh==1!2.4.0

Deployment

Astronomer

Deployment details

GKE, vanilla astro dev, LocalExecutor and CeleryExecutor.

Anything else

Always

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

</div>

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 14 (13 by maintainers)

Top GitHub Comments

1 reaction
m1racoli commented, Mar 22, 2022

Awesome! Great work @potiuk!

1 reaction
potiuk commented, Mar 22, 2022

Really nice one - fix in #22475 😄 - tested it with:

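import os
from time import sleep
import tempfile


def test():
    # Create a temporary directory without a `with` block, then fail.
    tmpdir = tempfile.TemporaryDirectory()
    print(f"directory {tmpdir.name} created")
    assert os.path.exists(tmpdir.name)
    raise Exception("exiting")


# os.fork() is only available on Unix-like systems.
pid = os.fork()
if pid:
    # Parent process: give the child time to run and exit.
    sleep(2)
else:
    # Child process: run the failing function and swallow the exception.
    try:
        test()
    except BaseException:
        pass
    finally:
        pass
    # Exit the child immediately, outside the try/finally block.
    os._exit(0)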
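
The test script above ends the child process with `os._exit`, which terminates a process without running interpreter shutdown cleanup. As a hedged illustration of why that matters for `tempfile.TemporaryDirectory`, the sketch below starts two child interpreters that each keep a reference to a temporary directory: one exits through `os._exit(0)`, the other through `sys.exit(0)`. The helper names `leftover_after` and `demo` are hypothetical. This is not the change made in #22475; it only shows the general Python behavior that the test script appears to exercise.

```python
import os
import shutil
import subprocess
import sys

# Source of the child program. The module-level `tmpdir` reference stays alive
# until the process ends, so cleanup depends entirely on how the process exits.
CHILD_TEMPLATE = """
import os, sys, tempfile
tmpdir = tempfile.TemporaryDirectory()
print(tmpdir.name)
sys.stdout.flush()
{exit_call}
"""


def leftover_after(exit_call):
    """Run a child interpreter and report whether its temp directory survived."""
    code = CHILD_TEMPLATE.format(exit_call=exit_call)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    path = result.stdout.strip()
    return path, os.path.exists(path)


def demo():
    """Compare an immediate exit with a normal interpreter shutdown."""
    hard_path, hard_left = leftover_after("os._exit(0)")
    if hard_left:
        # Remove the leftover directory so the demo does not litter the system.
        shutil.rmtree(hard_path)
    soft_path, soft_left = leftover_after("sys.exit(0)")
    return hard_left, soft_left


if __name__ == "__main__":
    hard_left, soft_left = demo()
    print(f"left behind after os._exit: {hard_left}")
    print(f"left behind after sys.exit: {soft_left}")
```

A process that must exit through `os._exit` therefore has to release or clean up such objects explicitly before making that call.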