[tune] Crash with ValueError: I/O operation on closed file
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
- Ray installed from (source or binary): binary
- Ray version: 0.6.5 and 0.6.6
- Python version: 3.6.5
- Exact command to reproduce:
I ran Tune experiments on GCP, and every experiment that shared the cluster with one or more other experiments crashed with the following:
Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/softlearning/bin/softlearning", line 11, in <module>
    load_entry_point('softlearning', 'console_scripts', 'softlearning')()
  File "/home/ubuntu/softlearning/softlearning/scripts/console_scripts.py", line 202, in main
    return cli()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/ubuntu/softlearning/softlearning/scripts/console_scripts.py", line 95, in run_example_cluster_cmd
    run_example_cluster(example_module_name, example_argv)
  File "/home/ubuntu/softlearning/examples/instrument.py", line 283, in run_example_cluster
    reuse_actors=True)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/tune.py", line 235, in run
    runner.step()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 273, in step
    self._process_events() # blocking
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 439, in _process_events
    self._process_trial(trial)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 504, in _process_trial
    trial, error=True, error_msg=error_msg)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 233, in stop_trial
    trial, error=error, error_msg=error_msg, stop_logger=stop_logger)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 160, in _stop_trial
    trial.close_logger()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial.py", line 391, in close_logger
    self.result_logger.close()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/logger.py", line 233, in close
    self._log_syncer.sync_now(force=False)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/log_sync.py", line 222, in sync_now
    final_cmd, shell=True, stdout=self.logfile)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/subprocess.py", line 667, in __init__
    errread, errwrite) = self._get_handles(stdin, stdout, stderr)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/subprocess.py", line 1184, in _get_handles
    c2pwrite = stdout.fileno()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/tempfile.py", line 483, in func_wrapper
    return func(*args, **kwargs)
ValueError: I/O operation on closed file
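The last frames of the traceback suggest that a temporary file passed as `stdout` to `subprocess.Popen` had already been closed when the sync command was launched. The sketch below reproduces that failure mode in isolation using only the standard library. It is not Ray's code: the function name `sync_with_closed_logfile` and the dummy command are assumptions made for illustration.

```python
import subprocess
import sys
import tempfile


def sync_with_closed_logfile():
    """Pass an already-closed temporary file as stdout to subprocess.Popen.

    Returns the error message if Popen raises ValueError, otherwise None.
    """
    logfile = tempfile.NamedTemporaryFile(prefix="log_sync", suffix=".log")
    logfile.close()  # simulate the log file being closed before the sync runs
    try:
        # Popen calls stdout.fileno() while setting up handles, which fails
        # on a closed file object before any child process is started.
        subprocess.Popen([sys.executable, "-c", "print('sync')"], stdout=logfile)
    except ValueError as exc:
        return str(exc)
    return None


message = sync_with_closed_logfile()
print(message)
```

If this matches what happens in Tune, the exception would be raised before the sync command starts, so the command itself is unlikely to be at fault.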
Issue Analytics
- Created 4 years ago
- Comments: 7 (7 by maintainers)
Top GitHub Comments
OK, this happens during restoration and should be addressed in #5053.
The original ValueError: I/O operation on closed file might not be the actual cause of the failure, but rather a secondary error raised during restoration. I saw the same behavior locally after some trials failed due to a bug in the application, and the ValueError: I/O operation on closed file only appeared at restore time.
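As a purely illustrative sketch (not the change made in #5053, whose contents are not shown here), a caller could check whether the log file is still open before handing it to `subprocess.Popen`, so that a closed handle does not hide the error that originally failed the trial. The helper name `run_sync` and the fallback to `subprocess.DEVNULL` are assumptions.

```python
import subprocess
import sys
import tempfile


def run_sync(cmd, logfile):
    """Hypothetical helper: run cmd, sending stdout to logfile if it is open."""
    if logfile is None or logfile.closed:
        # Assumption: discarding output is preferable to crashing the caller.
        return subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
    return subprocess.Popen(cmd, stdout=logfile)


# Usage: a closed log file no longer raises ValueError.
closed_log = tempfile.NamedTemporaryFile(suffix=".log")
closed_log.close()
proc = run_sync([sys.executable, "-c", "print('sync')"], closed_log)
returncode = proc.wait()
print(returncode)
```

This only avoids the secondary exception. The underlying trial failure would still need to be diagnosed from the trial's own error output.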