[tune] Crash with ValueError: I/O operation on closed file

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Ray installed from (source or binary): binary
  • Ray version: 0.6.5 and 0.6.6
  • Python version: 3.6.5
  • Exact command to reproduce:

I ran tune experiments on GCP, and every experiment that shared its cluster with one or more other experiments crashed with the following:

Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/softlearning/bin/softlearning", line 11, in <module>
    load_entry_point('softlearning', 'console_scripts', 'softlearning')()
  File "/home/ubuntu/softlearning/softlearning/scripts/console_scripts.py", line 202, in main
    return cli()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/ubuntu/softlearning/softlearning/scripts/console_scripts.py", line 95, in run_example_cluster_cmd
    run_example_cluster(example_module_name, example_argv)
  File "/home/ubuntu/softlearning/examples/instrument.py", line 283, in run_example_cluster
    reuse_actors=True)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/tune.py", line 235, in run
    runner.step()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 273, in step
    self._process_events()  # blocking
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 439, in _process_events
    self._process_trial(trial)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 504, in _process_trial
    trial, error=True, error_msg=error_msg)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 233, in stop_trial
    trial, error=error, error_msg=error_msg, stop_logger=stop_logger)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 160, in _stop_trial
    trial.close_logger()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial.py", line 391, in close_logger
    self.result_logger.close()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/logger.py", line 233, in close
    self._log_syncer.sync_now(force=False)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/log_sync.py", line 222, in sync_now
    final_cmd, shell=True, stdout=self.logfile)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/subprocess.py", line 667, in __init__
    errread, errwrite) = self._get_handles(stdin, stdout, stderr)
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/subprocess.py", line 1184, in _get_handles
    c2pwrite = stdout.fileno()
  File "/home/ubuntu/.conda/envs/softlearning/lib/python3.6/tempfile.py", line 483, in func_wrapper
    return func(*args, **kwargs)
ValueError: I/O operation on closed file
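
The last frames of the traceback show `subprocess` calling `stdout.fileno()` on a temporary file that has already been closed. A minimal sketch that triggers the same failure outside of Ray is below; the function name and the `log_sync` prefix are illustrative assumptions, not taken from Ray's code.

```python
import subprocess
import sys
import tempfile


def popen_with_closed_stdout():
    """Return the error message raised when stdout is a closed temp file."""
    # Hypothetical stand-in for the syncer's log file.
    logfile = tempfile.NamedTemporaryFile(prefix="log_sync", suffix=".log")
    logfile.close()  # simulate the log file having been closed earlier
    try:
        # Popen asks the stdout object for its file descriptor before it
        # spawns anything, so the error surfaces here.
        subprocess.Popen([sys.executable, "-c", "pass"], stdout=logfile)
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(popen_with_closed_stdout())
```

The error is raised before the child process starts, so the command being launched never runs.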

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

richardliaw commented, Jul 2, 2019 (1 reaction)

OK, this happens during restoration and should be addressed in #5053.

hartikainen commented, May 14, 2019 (0 reactions)

The original ValueError: I/O operation on closed file might not be the actual cause of the failure; it may just be an error raised while restoring. I saw the same behavior locally after some trials failed due to a bug in the application, and the ValueError: I/O operation on closed file only appeared at restore time.
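
If the log file really is closed by the time a restored trial syncs again, one defensive pattern is to check the handle and reopen it before launching the subprocess. The sketch below is an assumption-laden illustration: `ReopeningSyncer` and its methods are hypothetical names, it is not Ray's `log_sync` implementation, and it is not taken from #5053.

```python
import subprocess
import sys
import tempfile


class ReopeningSyncer:
    """Hypothetical stand-in for a log syncer; not Ray's actual class."""

    def __init__(self):
        self.logfile = self._open_logfile()

    @staticmethod
    def _open_logfile():
        # A fresh temporary file; the earlier one is gone once it is closed.
        return tempfile.NamedTemporaryFile(prefix="log_sync", suffix=".log")

    def close(self):
        self.logfile.close()

    def sync_now(self, cmd):
        # Guard: a closed handle would make Popen raise ValueError.
        if self.logfile.closed:
            self.logfile = self._open_logfile()
        proc = subprocess.Popen(cmd, stdout=self.logfile)
        return proc.wait()


if __name__ == "__main__":
    syncer = ReopeningSyncer()
    syncer.close()  # simulate the state after a trial was stopped
    print(syncer.sync_now([sys.executable, "-c", "print('synced')"]))
    syncer.close()
```

This only hides the symptom. As noted above, the application error that made the trial fail in the first place still needs to be found separately.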
