exp: "Too many open files" error in `exp`-related commands.
Bug Report
Sometimes on macOS we hit the “too many open files” error. This can happen in `exp run`, in `exp show`, or inside the celery worker.
The point at which it happens also varies: I have hit this error during setup, while running, and during result collection. Raising `ulimit` to a larger number (for example 1024 instead of the default 256) can prevent this.
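
For anyone who wants to check or raise the limit from inside Python rather than with the shell's `ulimit`, here is a minimal sketch using the standard-library `resource` module. The helper name `raise_nofile_limit` and the target of 1024 are only illustrative (1024 is the value suggested above); this is not something DVC does itself, and it only affects the current process and the children it starts.

```python
import resource


def raise_nofile_limit(target=1024):
    """Raise the soft open-file limit to `target` if it is currently lower.

    Hypothetical helper for illustration; returns the resulting soft limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # The soft limit can never exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Only raise the limit; leave it alone if it is already high enough.
    if soft != resource.RLIM_INFINITY and soft < target:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft open-file limit is now", raise_nofile_limit(1024))
```

Running this before starting the experiment code in the same process is roughly comparable to running `ulimit -n 1024` in the shell before the `dvc exp` command.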
Traceback (most recent call last):
File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/tasks.py", line 66, in collect_exp
BaseStashQueue.collect_executor(
File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/base.py", line 643, in collect_executor
results = cls.collect_git(exp, executor, exec_result)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/funcy/decorators.py", line 45, in wrapper
return deco(call, *dargs, **dkwargs)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/funcy/flow.py", line 127, in retry
return call()
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/funcy/decorators.py", line 66, in __call__
return self._func(*self._args, **self._kwargs)
File "/Users/gao/Code/dvc/dvc/repo/experiments/utils.py", line 40, in wrapper
return f(exp, *args, **kwargs)
File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/base.py", line 623, in collect_git
for ref in executor.fetch_exps(
File "/Users/gao/Code/dvc/dvc/repo/experiments/executor/base.py", line 367, in fetch_exps
dest_scm.fetch_refspecs(
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/scmrepo/git/__init__.py", line 289, in _backend_func
result = func(*args, **kwargs)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 634, in fetch_refspecs
fetch_result = client.fetch(
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/client.py", line 1502, in fetch
refs = r.fetch(
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/repo.py", line 427, in fetch
count, pack_data = self.fetch_pack_data(
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/repo.py", line 460, in fetch_pack_data
objects = self.fetch_objects(
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/repo.py", line 494, in fetch_objects
obj = self.object_store[sha]
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/object_store.py", line 144, in __getitem__
type_num, uncomp = self.get_raw(sha)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/object_store.py", line 581, in get_raw
ret = self._get_loose_object(hexsha)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/object_store.py", line 745, in _get_loose_object
return ShaFile.from_path(path)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/objects.py", line 420, in from_path
with GitFile(path, "rb") as f:
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/dulwich/file.py", line 94, in GitFile
return io.open(filename, mode, bufsize)
OSError: [Errno 24] Too many open files: '/Users/gao/test/vscode-dvc/demo/.dvc/tmp/exps/tmp9x5praul/.git/objects/55/4aa474348369ca4eec2945226686b9e4f11666'
[2022-10-25 16:25:09,522: ERROR/MainProcess] Task dvc.repo.experiments.queue.tasks.run_exp[cf2fd3f4-a0c2-4c42-80b9-9554e03d9a55] raised unexpected: OSError(24, 'Too many open files')
Traceback (most recent call last):
File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 628, in _rmtree_safe_fd
with os.scandir(topfd) as scandir_it:
OSError: [Errno 24] Too many open files: '/Users/gao/test/vscode-dvc/demo/.dvc/tmp/exps/tmp9x5praul/demo'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
R = retval = fun(*args, **kwargs)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
return self.run(*args, **kwargs)
File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/tasks.py", line 111, in run_exp
cleanup_exp.s(executor, infofile)()
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/canvas.py", line 168, in __call__
return self.type(*args, **kwargs)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/trace.py", line 735, in __protected_call__
return orig(self, *args, **kwargs)
File "/Users/gao/test/vscode-dvc/demo/.venv/lib/python3.10/site-packages/celery/app/task.py", line 392, in __call__
return self.run(*args, **kwargs)
File "/Users/gao/Code/dvc/dvc/repo/experiments/queue/tasks.py", line 87, in cleanup_exp
executor.cleanup(infofile)
File "/Users/gao/Code/dvc/dvc/repo/experiments/executor/local.py", line 128, in cleanup
remove(self.root_dir)
File "/Users/gao/Code/dvc/dvc/utils/fs.py", line 69, in remove
shutil.rmtree(path, onerror=_chmod)
File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 724, in rmtree
_rmtree_safe_fd(fd, path, onerror)
File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 657, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/opt/homebrew/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 632, in _rmtree_safe_fd
onerror(os.scandir, path, sys.exc_info())
File "/Users/gao/Code/dvc/dvc/utils/fs.py", line 54, in _chmod
func(p)
OSError: [Errno 24] Too many open files: '/Users/gao/test/vscode-dvc/demo/.dvc/tmp/exps/tmp9x5praul/demo'
Description
Reproduce
Expected
Environment information
Output of `dvc doctor`:
$ dvc doctor
Additional Information (if any):
Issue Analytics
- Created a year ago
- Comments: 18 (4 by maintainers)
Top GitHub Comments
@karajan1001 I’m curious if anyone else on the vscode team has run into this same problem? Also, could you run `git count-objects -vH` in your clone of the vscode repo? (In your main clone of the repo, not in a temp exp workspace.)
What I am wondering is this: if you aren’t making regular changes in the vscode repo (and are only ever using `dvc exp ...` for testing purposes), you also aren’t using CLI git in that clone regularly, so it may not be gc’d at all, which could also lead to dulwich hitting the file limit when we try to collect exps. There could be too many loose objects on the receiving end of the fetch (in the main repo) and not necessarily too many objects in the source end (in an exp temp workspace). A typical user would not hit this scenario, because they would presumably be using regular CLI git themselves in conjunction with DVC on a regular enough basis that git would perform a gc at some point.
I think we can close this for now, given that it is unlikely this will occur in normal DVC use (where users are at least semi-regularly also using CLI git commands).
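
If CLI git is not handy, a rough way to check the loose-object count asked about above is to walk `.git/objects` directly. This is only a sketch: `count_loose_objects` is a hypothetical helper, not part of DVC, scmrepo, or dulwich, and it assumes the usual layout in which loose objects live in two-hex-character subdirectories, next to `pack/` and `info/`. It approximates the `count` field of `git count-objects` and ignores packed objects.

```python
import re
from pathlib import Path

# Loose objects live in fan-out directories named by the first two hex
# characters of the object id, e.g. .git/objects/55/4aa4...
_FANOUT_DIR = re.compile(r"^[0-9a-f]{2}$")


def count_loose_objects(git_dir):
    """Return the number of files in the loose-object fan-out directories.

    `git_dir` is the path to the .git directory of the main clone.
    """
    objects_dir = Path(git_dir) / "objects"
    if not objects_dir.is_dir():
        return 0

    count = 0
    for sub in objects_dir.iterdir():
        # Skip pack/ and info/, which do not hold loose objects.
        if sub.is_dir() and _FANOUT_DIR.match(sub.name):
            count += sum(1 for entry in sub.iterdir() if entry.is_file())
    return count


if __name__ == "__main__":
    print("loose objects:", count_loose_objects(".git"))
```

If the number is large, running `git gc` in the main clone packs the loose objects, which is the cleanup the comment expects regular CLI git use to trigger.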