TensorBoard: gracefully handle deleted event files

As far as I can tell, this is exactly the same as this issue: https://github.com/tensorflow/tensorflow/issues/3267

If I delete files from the logdir while TensorBoard is running, I get errors like:

E0911 11:27:19.441699 139989077399296 plugin_event_multiplexer.py:226] Unable to reload accumulator 'srresnet_voc_2x': [Errno 2] No such file or directory: b'/home/maksim/data/tensorboard/srresnet_voc_2x/events.out.tfevents.1568215513.maksim-desktop.105092.0'
E0911 11:27:24.447706 139989077399296 plugin_event_multiplexer.py:226] Unable to reload accumulator 'srresnet_voc_2x': [Errno 2] No such file or directory: b'/home/maksim/data/tensorboard/srresnet_voc_2x/events.out.tfevents.1568215513.maksim-desktop.105092.0'
E0911 11:27:29.453577 139989077399296 plugin_event_multiplexer.py:226] Unable to reload accumulator 'srresnet_voc_2x': [Errno 2] No such file or directory: b'/home/maksim/data/tensorboard/srresnet_voc_2x/events.out.tfevents.1568215513.maksim-desktop.105092.0'
E0911 11:27:34.459517 139989077399296 plugin_event_multiplexer.py:226] Unable to reload accumulator 'srresnet_voc_2x': [Errno 2] No such file or directory: b'/home/maksim/data/tensorboard/srresnet_voc_2x/events.out.tfevents.1568215513.maksim-desktop.105092.0'

I’m using tb-nightly==1.15.0a20190911 through PyTorch.

I’m not sure when reaping is supposed to happen (e.g. as in https://github.com/tensorflow/tensorflow/issues/3267#issuecomment-292911229), or how to force it manually so that I get:

WARNING:tensorflow:Deleting accumulator 'run1/test'
WARNING:tensorflow:Deleting accumulator 'run1'
WARNING:tensorflow:Deleting accumulator 'run2/test'
WARNING:tensorflow:Deleting accumulator 'run2'
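
For illustration, here is a minimal standalone sketch of what "reaping" could look like from the outside: given a logdir and the run names a server currently knows about, it reports the runs whose directories are gone or no longer contain event files. The function name `find_stale_runs` and the "filename contains `tfevents`" check are assumptions made for this example; this is not TensorBoard's own reaping code.

```python
import os


def find_stale_runs(logdir, known_runs):
    """Return the names in known_runs that have no event files left on disk.

    Hypothetical helper: a run counts as stale when its directory is missing
    or contains no file whose name includes "tfevents".
    """
    stale = []
    for run in known_runs:
        run_dir = os.path.join(logdir, run)
        try:
            has_events = any("tfevents" in name for name in os.listdir(run_dir))
        except (FileNotFoundError, NotADirectoryError):
            # The run directory itself was deleted or replaced by a file.
            has_events = False
        if not has_events:
            stale.append(run)
    return stale
```

A caller could run this once per reload cycle and drop the returned runs before attempting to read them, instead of logging the same error every few seconds.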

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Reactions: 2
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

thincal commented, Dec 4, 2019 (1 reaction)

@stephanwlee One important thing I forgot to mention: I ran TensorBoard without TensorFlow installed. You could try it again that way.

thincal commented, Oct 31, 2019 (1 reaction)

@stephanwlee

The raised exception causes the application’s “Reloader” thread to exit, so the graphs are no longer updated.

tensorflow.python.framework.errors_impl.NotFoundError: Could not find directory xxxx
tensorboard.backend.event_processing.directory_watcher.DirectoryDeletedError: Directory xxxx deleted

This seems like a bug. The behavior I would expect is:

  • handle the DirectoryDeletedError exception gracefully, without crashing
  • remove the deleted run’s graph from the web UI
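
The expected behavior can be sketched as a reload loop that catches the error per run, so that one deleted directory neither kills the reloading thread nor keeps the run listed. This is a self-contained illustration, not TensorBoard's implementation: `DirectoryDeletedError` here is a local stand-in for the class named in the traceback, and `RunRegistry` with its methods is hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


class DirectoryDeletedError(Exception):
    """Local stand-in for the error in the traceback (not imported from TensorBoard)."""


class RunRegistry:
    """Hypothetical registry mapping run names to zero-argument reload callables."""

    def __init__(self):
        self._loaders = {}

    def add_run(self, name, loader):
        self._loaders[name] = loader

    def runs(self):
        return sorted(self._loaders)

    def reload(self):
        # Iterate over a copy of the keys so runs can be removed while looping.
        for name in list(self._loaders):
            try:
                self._loaders[name]()
            except DirectoryDeletedError:
                # The run is gone for good: drop it so it stops being served.
                logger.warning("Deleting accumulator %r", name)
                del self._loaders[name]
            except OSError as e:
                # Possibly transient (e.g. a single file removed): keep the run.
                logger.error("Unable to reload accumulator %r: %s", name, e)
```

Because every exception is handled inside the loop, a thread that calls `reload()` periodically keeps running after a run directory disappears, and the remaining runs continue to update.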

FYI, full callstack:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/abc/Library/Python/3.7/lib/python/site-packages/tensorboard/backend/application.py", line 502, in _reload
    multiplexer.AddRunsFromDirectory(path, name)
  File "/Users/abc/Library/Python/3.7/lib/python/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 193, in AddRunsFromDirectory
    self.AddRun(subdir, name=subname)
  File "/Users/abc/Library/Python/3.7/lib/python/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 158, in AddRun
    accumulator.Reload()
  File "/Users/abc/Library/Python/3.7/lib/python/site-packages/tensorboard/backend/event_processing/plugin_event_accumulator.py", line 177, in Reload
    for event in self._generator.Load():
  File "/Users/abc/Library/Python/3.7/lib/python/site-packages/tensorboard/backend/event_processing/directory_watcher.py", line 94, in Load
    'Directory %s has been permanently deleted' % self._directory)
tensorboard.backend.event_processing.directory_watcher.DirectoryDeletedError: Directory /Users/abc/test/test_delete/run2 has been permanently deleted