Tensorboard frontend does not update when the log directory changes.

Environment information (required)

Diagnostics

Diagnostics output
--- check: autoidentify
INFO: diagnose_tensorboard.py version b5843ba83bb708385ff54baaab4b2c70c39f7a4f

--- check: general
INFO: sys.version_info: sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)
INFO: os.name: posix
INFO: os.uname(): posix.uname_result(sysname='Linux', nodename='c8278555cf9a', release='4.19.69-1-MANJARO', version='#1 SMP PREEMPT Thu Aug 29 08:51:46 UTC 2019', machine='x86_64')
INFO: sys.getwindowsversion(): N/A

--- check: package_management
INFO: has conda-meta: False
INFO: $VIRTUAL_ENV: None

--- check: installed_packages
INFO: installed: tensorboard==1.14.0
diagnose_tensorboard.py:197: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
  logging.warn("no installation among: %s", sorted(family))
WARNING: no installation among: ['tensorflow', 'tensorflow-gpu', 'tf-nightly', 'tf-nightly-2.0-preview', 'tf-nightly-gpu', 'tf-nightly-gpu-2.0-preview']
WARNING: no installation among: ['tensorflow-estimator', 'tensorflow-estimator-2.0-preview', 'tf-estimator-nightly']

--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '1.14.0'

--- check: tensorflow_python_version
Traceback (most recent call last):
  File "diagnose_tensorboard.py", line 419, in main
    suggestions.extend(check())
  File "diagnose_tensorboard.py", line 77, in wrapper
    result = fn()
  File "diagnose_tensorboard.py", line 236, in tensorflow_python_version
    import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

--- check: tensorboard_binary_path
INFO: which tensorboard: b'/usr/local/bin/tensorboard\n'

--- check: readable_fqdn
INFO: socket.getfqdn(): 'c8278555cf9a'

--- check: stat_tensorboardinfo
INFO: directory: /tmp/.tensorboard-info
INFO: os.stat(...): os.stat_result(st_mode=16895, st_ino=19925017, st_dev=48, st_nlink=2, st_uid=0, st_gid=0, st_size=4096, st_atime=1569252142, st_mtime=1569252142, st_ctime=1569252142)
INFO: mode: 0o40777

--- check: source_trees_without_genfiles
INFO: tensorboard_roots (1): ['/usr/local/lib/python3.7/site-packages']; bad_roots (0): []

--- check: full_pip_freeze
INFO: pip freeze --all:
absl-py==0.8.0
grpcio==1.23.0
Markdown==3.1.1
numpy==1.17.2
pip==19.2.3
protobuf==3.9.1
setuptools==41.2.0
six==1.12.0
tensorboard==1.14.0
Werkzeug==0.16.0
wheel==0.33.6

Issue description

Please describe the bug as clearly as possible. How can we reproduce the problem without additional resources (including external data files and proprietary Python modules)?

I am trying to run the TensorBoard server in a Docker container, but there seem to be problems with updating the log directory. TensorBoard loads a single file and then stays stuck in that state, even if I manually go into the container and delete the files. For example, I have 4 folders, each with a tfevents file, and this is what the resulting frontend is stuck at:

[Screenshot of the TensorBoard frontend in the stuck state]

An exception is raised in the Reloader thread while the server starts up, and it seems related to the issue. Output of docker run:

/usr/local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
TensorFlow installation not found - running with reduced feature set.
Exception in thread Reloader:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorboard/backend/application.py", line 430, in _reload
    multiplexer.Reload()
  File "/usr/local/lib/python3.7/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 240, in Reload
    Worker()
  File "/usr/local/lib/python3.7/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 218, in Worker
    accumulator.Reload()
  File "/usr/local/lib/python3.7/site-packages/tensorboard/backend/event_processing/plugin_event_accumulator.py", line 177, in Reload
    for event in self._generator.Load():
  File "/usr/local/lib/python3.7/site-packages/tensorboard/backend/event_processing/directory_watcher.py", line 89, in Load
    for event in self._LoadInternal():
  File "/usr/local/lib/python3.7/site-packages/tensorboard/backend/event_processing/directory_watcher.py", line 113, in _LoadInternal
    for event in self._loader.Load():
  File "/usr/local/lib/python3.7/site-packages/tensorboard/backend/event_processing/event_file_loader.py", line 95, in Load
    yield event_pb2.Event.FromString(record)
google.protobuf.message.DecodeError: Error parsing message

TensorBoard 1.14.0 at http://c8278555cf9a:6006/ (Press CTRL+C to quit)
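
The DecodeError in the traceback is raised while parsing a record read from an event file, so one way to narrow things down might be to check whether each event file in the mounted directory is at least structurally complete. The sketch below is not part of TensorBoard. It assumes the standard TFRecord framing (8-byte little-endian length, 4-byte CRC of the length, payload, 4-byte CRC of the payload), it does not verify the CRCs or parse the protobuf payload, and the name count_framed_records is made up for illustration. The default /logs path is taken from the Dockerfile in the steps to reproduce.

```python
import os
import struct
import sys

HEADER_SIZE = 12  # 8-byte little-endian payload length + 4-byte CRC of that length
FOOTER_SIZE = 4   # 4-byte CRC of the payload


def count_framed_records(path):
    """Return (complete_records, trailing_bytes) for one event file.

    Only the record framing is checked. CRCs are not verified and the
    payload is not parsed, so a file can pass this check and still fail
    to decode.
    """
    size = os.path.getsize(path)
    records = 0
    offset = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER_SIZE)
            if len(header) < HEADER_SIZE:
                break  # end of file, or a partial header
            (length,) = struct.unpack("<Q", header[:8])
            end = offset + HEADER_SIZE + length + FOOTER_SIZE
            if end > size:
                break  # the declared payload runs past the end of the file
            f.seek(end)
            offset = end
            records += 1
    return records, size - offset


if __name__ == "__main__":
    # Hypothetical usage: python check_events.py /logs
    logdir = sys.argv[1] if len(sys.argv) > 1 else "/logs"
    for root, _dirs, files in os.walk(logdir):
        for name in sorted(files):
            if "tfevents" in name:
                full_path = os.path.join(root, name)
                records, trailing = count_framed_records(full_path)
                print(f"{full_path}: {records} complete records, "
                      f"{trailing} trailing bytes")
```

A non-zero trailing byte count would suggest a truncated or still-being-written file. A count of zero only says the framing looks consistent, not that every record decodes.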

If I use the official TensorFlow image, everything works fine. However, that image is too big at 1.2 GB, and I really just need a bare-bones TensorBoard.

Steps to reproduce:

  1. Use this Dockerfile:
FROM python:3.7-slim

RUN pip install --upgrade pip
RUN pip install tensorboard==1.14

COPY diagnose_tensorboard.py .

VOLUME /logs

EXPOSE 6006

ENTRYPOINT tensorboard --logdir /logs

  2. Build the image: docker build -t my-tb .

  3. Run a container with mounted tfevent files: docker run -it --rm -p 6006:6006 --name tb_test -v $(realpath tb_files):/logs my-tb /bin/bash -c "tensorboard --logdir /logs"

  4. Go to localhost:6006

  5. Observe the issue. Try deleting tfevent files in the container and notice that the frontend never updates.
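
To make the "never updates" observation easier to confirm, it may help to compare what is on disk with what the server reports. The following is a rough sketch, not an official tool: find_run_dirs and fetch_served_runs are hypothetical helper names, the assumption that a run is any directory containing a file with "tfevents" in its name reflects my understanding of how TensorBoard discovers runs, and the /data/runs endpoint and localhost:6006 address are assumptions that may not hold for every version or setup.

```python
import json
import os
import sys
import urllib.request


def find_run_dirs(logdir):
    """Return sorted directories (relative to logdir) holding a tfevents file."""
    runs = []
    for root, _dirs, files in os.walk(logdir):
        if any("tfevents" in name for name in files):
            runs.append(os.path.relpath(root, logdir))
    return sorted(runs)


def fetch_served_runs(base_url="http://localhost:6006"):
    """Ask a running TensorBoard for its run list (assumes /data/runs exists)."""
    with urllib.request.urlopen(base_url + "/data/runs", timeout=5) as resp:
        return sorted(json.load(resp))


def main(argv):
    # Hypothetical usage: python compare_runs.py /logs http://localhost:6006
    logdir = argv[1] if len(argv) > 1 else "/logs"
    base_url = argv[2] if len(argv) > 2 else "http://localhost:6006"
    on_disk = find_run_dirs(logdir)
    print("runs on disk:  ", on_disk)
    try:
        served = fetch_served_runs(base_url)
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        print("could not reach TensorBoard:", exc)
        return
    print("runs served:   ", served)
    print("only on disk:  ", sorted(set(on_disk) - set(served)))
    print("only in server:", sorted(set(served) - set(on_disk)))


if __name__ == "__main__":
    main(sys.argv)
```

Running this before and after deleting files inside the container would show whether the server's run list ever changes, independently of any browser caching.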

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Comments: 10

Top GitHub Comments

czhang96 commented, Nov 2, 2019 (6 reactions)

Same issue here

LucaMarconato commented, Oct 23, 2020 (0 reactions)

Same here
