exp: "Too many open files" error in `exp`-related commands

Bug Report

Sometimes on macOS we hit a “Too many open files” error. It can happen in exp run, exp show, or inside the celery worker.
The stage at which it occurs also varies: I have hit it during setup, while running, and during result collection. Raising the limit with ulimit (for example, to 1024 instead of the default 256) prevents it.

Traceback (most recent call last):
  File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/tasks.py", line 66, in collect_exp
    BaseStashQueue.collect_executor(
  File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/base.py", line 643, in collect_executor
    results = cls.collect_git(exp, executor, exec_result)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/funcy/decorators.py", line 45, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/funcy/flow.py", line 127, in retry
    return call()
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/funcy/decorators.py", line 66, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/Users/gao/Code/dvc/dvc/repo/experiments/utils.py", line 40, in wrapper
    return f(exp, *args, **kwargs)
  File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/base.py", line 623, in collect_git
    for ref in executor.fetch_exps(
  File "/Users/gao/Code/dvc/dvc/repo/experiments/executor/base.py", line 367, in fetch_exps
    dest_scm.fetch_refspecs(
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/scmrepo/git/__init__.py", line 289, in _backend_func
    result = func(*args, **kwargs)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 634, in fetch_refspecs
    fetch_result = client.fetch(
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/client.py", line 1502, in fetch
    refs = r.fetch(
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/repo.py", line 427, in fetch
    count, pack_data = self.fetch_pack_data(
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/repo.py", line 460, in fetch_pack_data
    objects = self.fetch_objects(
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/repo.py", line 494, in fetch_objects
    obj = self.object_store[sha]
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/object_store.py", line 144, in __getitem__
    type_num, uncomp = self.get_raw(sha)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/object_store.py", line 581, in get_raw
    ret = self._get_loose_object(hexsha)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/object_store.py", line 745, in _get_loose_object
    return ShaFile.from_path(path)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/objects.py", line 420, in from_path
    with GitFile(path, "rb") as f:
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/file.py", line 94, in GitFile
    return io.open(filename, mode, bufsize)
OSError: [Errno 24] Too many open files: '/Users/gao/test/vscode-dvc/demo/.dvc/tmp/exps/tmp9x5praul/.git/objects/55/4aa474348369ca4eec2945226686b9e4f11666'
[2022-10-25 16:25:09,522: ERROR/MainProcess] Task dvc.repo.experiments.queue.tasks.run_exp[cf2fd3f4-a0c2-4c42-80b9-9554e03d9a55] raised unexpected: OSError(24, 'Too many open files')
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 628, in _rmtree_safe_fd
    with os.scandir(topfd) as scandir_it:
OSError: [Errno 24] Too many open files: '/Users/gao/test/vscode-dvc/demo/.dvc/tmp/exps/tmp9x5praul/demo'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/tasks.py", line 111, in run_exp
    cleanup_exp.s(executor, infofile)()
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/canvas.py", line 168, in __call__
    return self.type(*args, **kwargs)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/trace.py", line 735, in __protected_call__
    return orig(self, *args, **kwargs)
  File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/task.py", line 392, in __call__
    return self.run(*args, **kwargs)
  File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/tasks.py", line 87, in cleanup_exp
    executor.cleanup(infofile)
  File "/Users/gao/Code/dvc/dvc/repo/experiments/executor/local.py", line 128, in cleanup
    remove(self.root_dir)
  File "/Users/gao/Code/dvc/dvc/utils/fs.py", line 69, in remove
    shutil.rmtree(path, onerror=_chmod)
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 724, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 657, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 632, in _rmtree_safe_fd
    onerror(os.scandir, path, sys.exc_info())
  File "/Users/gao/Code/dvc/dvc/utils/fs.py", line 54, in _chmod
    func(p)
OSError: [Errno 24] Too many open files: '/Users/gao/test/vscode-dvc/demo/.dvc/tmp/exps/tmp9x5praul/demo'
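
The ulimit workaround mentioned in the report can also be applied from inside a Python process. The following is a minimal sketch and not part of DVC: the helper name `raise_nofile_soft_limit` and the default of 1024 are assumptions for illustration, and it uses only the standard-library `resource` module (Unix only).

```python
import resource


def raise_nofile_soft_limit(target=1024):
    """Raise the soft open-file limit to at least `target`.

    Hypothetical helper for illustration. The new soft limit never
    exceeds the hard limit. Returns (old_soft, new_soft).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Nothing to do if the soft limit is unlimited or already high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, soft

    # Cap the request at the hard limit unless the hard limit is unlimited.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return soft, new_soft


if __name__ == "__main__":
    old, new = raise_nofile_soft_limit()
    print(f"soft RLIMIT_NOFILE: {old} -> {new}")
```

This only changes the limit for the calling process and its children. It works around the symptom and does not explain what is holding the descriptors open.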

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 18 (4 by maintainers)

Top GitHub Comments

2 reactions
pmrowla commented, Oct 28, 2022

@karajan1001 I’m curious if anyone else on the vscode team has run into this same problem? Also, could you run git count-objects -vH in your clone of the vscode repo? (In your main clone of the repo, not in a temp exp workspace)

What I am wondering is whether, because you aren’t making regular changes in the vscode repo (and only ever use dvc exp ... for testing purposes), you also aren’t using CLI git in that clone regularly. If so, it may never be gc’d at all, which could lead to dulwich hitting the file limit when we try to collect exps. There could be too many loose objects on the receiving end of the fetch (in the main repo) and not necessarily too many objects on the source end (in an exp temp workspace). A typical user would not hit this scenario, because they would presumably be using regular CLI git themselves in conjunction with DVC often enough that git would perform a gc at some point.
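
The loose-object hypothesis in this comment can be checked without shelling out to git. Below is a rough sketch, assuming the standard layout in which loose objects live under `.git/objects/<two hex digits>/`. The function name `count_loose_objects` is hypothetical, and the number may differ slightly from what `git count-objects` reports.

```python
import os
import string

# Fan-out directory names are two lowercase hex digits, e.g. "55".
HEX_DIGITS = set(string.hexdigits.lower())


def count_loose_objects(git_dir):
    """Count files in the two-hex-digit fan-out directories of a .git dir.

    Hypothetical helper for illustration; `git_dir` is the path to the
    `.git` directory itself. The `pack` and `info` directories are skipped.
    """
    objects_dir = os.path.join(git_dir, "objects")
    count = 0
    with os.scandir(objects_dir) as fanout:
        for entry in fanout:
            is_fanout = (
                entry.is_dir()
                and len(entry.name) == 2
                and set(entry.name) <= HEX_DIGITS
            )
            if not is_fanout:
                continue
            # Close each directory handle promptly to keep descriptor use low.
            with os.scandir(entry.path) as objects:
                count += sum(1 for obj in objects if obj.is_file())
    return count


if __name__ == "__main__":
    print(count_loose_objects(".git"))
```

If the count in the main clone is large, running `git gc` there packs the loose objects, which is the cleanup the comment expects regular CLI git use to trigger.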

1 reaction
pmrowla commented, Nov 1, 2022

I think we can close this for now given that it is unlikely this will occur in normal DVC use (where users are at least semi-regularly also using CLI git commands).
