TensorBoard not automatically refreshing after 45 / 200 epochs in both Firefox and Google Chrome
Environment information (required)
Please run diagnose_tensorboard.py (link below) in the same environment from which you normally run TensorFlow/TensorBoard, and paste the output here:
Diagnostics output
--- check: autoidentify
INFO: diagnose_tensorboard.py version 393931f9685bd7e0f3898d7dcdf28819fef54c43
--- check: general
INFO: sys.version_info: sys.version_info(major=3, minor=7, micro=3, releaselevel='final', serial=0)
INFO: os.name: posix
INFO: os.uname(): posix.uname_result(sysname='Linux', nodename='0e3acfc06151', release='4.16.3-041603-generic', version='#201804190730 SMP Thu Apr 19 07:32:02 UTC 2018', machine='x86_64')
INFO: sys.getwindowsversion(): N/A
--- check: package_management
INFO: has conda-meta: True
INFO: $VIRTUAL_ENV: None
--- check: installed_packages
INFO: installed: tb-nightly==1.14.0a20190603
INFO: installed: tensorflow-gpu==2.0.0b0
INFO: installed: tf-estimator-nightly==1.14.0.dev2019060501
--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '1.14.0a20190603'
--- check: tensorflow_python_version
INFO: tensorflow.__version__: '2.0.0-beta0'
INFO: tensorflow.__git_version__: 'v1.12.1-3259-gf59745a'
--- check: tensorboard_binary_path
INFO: which tensorboard: b'/opt/conda/bin/tensorboard\n'
--- check: readable_fqdn
INFO: socket.getfqdn(): '0e3acfc06151'
--- check: stat_tensorboardinfo
INFO: directory: /tmp/.tensorboard-info
INFO: .tensorboard-info directory does not exist
--- check: source_trees_without_genfiles
INFO: tensorboard_roots (1): ['/opt/conda/lib/python3.7/site-packages']; bad_roots (0): []
--- check: full_pip_freeze
INFO: pip freeze --all:
absl-py==0.7.1
asn1crypto==0.24.0
astor==0.8.0
attrs==19.1.0
backcall==0.1.0
bleach==3.1.0
certifi==2019.6.16
cffi==1.12.2
chardet==3.0.4
conda==4.7.5
conda-package-handling==1.3.10
cryptography==2.6.1
cycler==0.10.0
decorator==4.4.0
defusedxml==0.5.0
eli5==0.9.0
entrypoints==0.3
gast==0.2.2
google-pasta==0.1.7
graphviz==0.11.1
grpcio==1.22.0
h5py==2.9.0
idna==2.8
imageio==2.5.0
ipykernel==5.1.1
ipython==7.6.1
ipython-genutils==0.2.0
jedi==0.14.0
Jinja2==2.10.1
joblib==0.13.2
json5==0.8.4
jsonschema==3.0.1
jupyter-client==5.2.4
jupyter-core==4.4.0
jupyterlab==1.0.1
jupyterlab-server==1.0.0
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
libarchive-c==2.8
Markdown==3.1.1
MarkupSafe==1.1.1
matplotlib==3.1.1
mistune==0.8.4
nbconvert==5.5.0
nbformat==4.4.0
networkx==2.3
notebook==5.7.8
numpy==1.16.4
olefile==0.46
pandas==0.24.2
pandocfilters==1.4.2
parso==0.5.0
patsy==0.5.1
pexpect==4.7.0
pickleshare==0.7.5
Pillow==6.1.0
pip==19.0.3
prometheus-client==0.7.1
prompt-toolkit==2.0.9
protobuf==3.8.0
ptyprocess==0.6.0
PubChemPy==1.0.4
py2cytoscape==0.7.1
pycairo==1.18.0
pycosat==0.6.3
pycparser==2.19
pydot==1.4.1
pydotplus==2.0.2
Pygments==2.4.2
pyOpenSSL==19.0.0
pyparsing==2.4.0
pyrsistent==0.15.2
PySocks==1.6.8
python-dateutil==2.8.0
python-igraph==0.7.1.post7
pytz==2019.1
PyWavelets==1.0.3
pyzmq==18.0.2
requests==2.22.0
ruamel-yaml==0.15.46
scikit-image==0.15.0
scikit-learn==0.21.2
scipy==1.3.0
seaborn==0.9.0
Send2Trash==1.5.0
setuptools==41.0.0
singledispatch==3.4.0.3
six==1.12.0
src==0.1.0
statsmodels==0.10.0
tabulate==0.8.3
tb-nightly==1.14.0a20190603
tensorflow-gpu==2.0.0b0
termcolor==1.1.0
terminado==0.8.2
testpath==0.4.2
tf-estimator-nightly==1.14.0.dev2019060501
tornado==6.0.3
tqdm==4.32.2
traitlets==4.3.2
typing==3.6.4
urllib3==1.24.1
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.15.4
wheel==0.33.1
wrapt==1.11.2
For browser-related issues, please additionally specify:
Chrome: Version 74.0.3729.131 (Official Build) (64-bit)
Firefox: 67.0.4 (64-bit)
Issue description
Posting for a colleague. Providing as much detail as possible.
They have a tf.keras model with some log summaries. Logging works up to epoch 11, 45, or 200, and then TensorBoard stops automatically refreshing. Monitoring the HTTP requests in the browser inspector shows that update calls are made every 30 seconds and return status code 200, but the response only contains data up to the epoch (11, 45, or 200) where it froze. If the tensorboard process is killed and then restarted, all the data appears.
This suggests to me that whatever function is reading the log files somehow stops being able to read newly added lines.
They have tried all combinations of TF 1.14, 1.15 and 2.0 with TB 1.4 and 1.15.
TF2a0 and TB 1.14.0a20190301 seem to work…
When using TB 1.15 they see:
W0708 10:40:08.985069 140053402789632 plugin_event_accumulator.py:294] Found more than one graph event per run, or there was a metagraph containing a graph_def, as well as one or more graph events. Overwriting the graph with the newest event.
E0708 10:40:15.037355 140053402789632 directory_watcher.py:242] File /mnt/project/data/logs/model/test_4/train/events.out.tfevents.1562574945.e5513778500b.1.134.v2 updated even though the current file is /mnt/project/data/logs/model/test_4/train/events.out.tfevents.1562574948.e5513778500b.profile-empty
The relevant issue, https://github.com/tensorflow/tensorboard/pull/2342, is already resolved. I would appreciate any insights.
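To make the suspected failure mode concrete, here is a minimal sketch in plain Python. It is not TensorBoard's code. It assumes a watcher that reads files in name order and never goes back to an earlier file once it has moved on, which is what the directory_watcher.py error above seems to describe. The class name and the shortened file names are made up for illustration.

```python
import os
import tempfile


class NaiveDirectoryWatcher:
    """Illustrative only: tails files in name order and never goes back."""

    def __init__(self, directory):
        self.directory = directory
        self.current = None  # name of the file currently being tailed
        self.offset = 0      # bytes already consumed from that file

    def load(self):
        """Return lines added since the last call."""
        lines = []
        while True:
            if self.current is not None:
                path = os.path.join(self.directory, self.current)
                with open(path, "rb") as f:
                    f.seek(self.offset)
                    data = f.read()
                    self.offset = f.tell()
                lines.extend(data.decode().splitlines())
            # Only files that sort after the current one are ever considered.
            later = sorted(
                name for name in os.listdir(self.directory)
                if self.current is None or name > self.current
            )
            if not later:
                return lines
            self.current, self.offset = later[0], 0


def demo():
    with tempfile.TemporaryDirectory() as logdir:
        # Hypothetical names, shortened from the ones in the log above.
        events = os.path.join(logdir, "events.1562574945.v2")
        marker = os.path.join(logdir, "events.1562574948.profile-empty")
        with open(events, "w") as f:
            f.write("epoch 1\n")
        open(marker, "w").close()  # empty file with a later timestamp

        watcher = NaiveDirectoryWatcher(logdir)
        first = watcher.load()   # reads the events file, then moves to the marker
        with open(events, "a") as f:
            f.write("epoch 2\n")
        second = watcher.load()  # still on the marker, so the append is missed
        fresh = NaiveDirectoryWatcher(logdir).load()  # a restart sees everything
        return first, second, fresh


if __name__ == "__main__":
    print(demo())
```

In this sketch the long-running watcher never reports "epoch 2", while a freshly created one does, which matches the "restart and all the data appears" behaviour described above.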
Issue Analytics
- Created 4 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
Hi @SumNeuron, thanks for the report. You may be running into #2084. Could you try adding profile_batch=0 to your Keras callback and seeing if that fixes the problem?

@wchargin First, thanks for your time and insight. What exactly is profile_batch=0? I do not see it in the defs of the Keras callbacks.
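For reference, as far as I know profile_batch is a constructor argument of tf.keras.callbacks.TensorBoard in the TensorFlow releases that ship the profiler, and setting it to 0 disables profiling. That should also stop the profile-empty file from being written, though I have not verified this against every version mentioned in this thread. Below is a minimal sketch; the helper names and the log directory are illustrative, not part of any API.

```python
def tensorboard_callback_kwargs(log_dir):
    """Arguments for tf.keras.callbacks.TensorBoard with profiling turned off."""
    return {
        "log_dir": log_dir,
        "profile_batch": 0,  # 0 disables profiling
    }


def make_tensorboard_callback(log_dir):
    # Imported lazily so the settings above can be inspected without TensorFlow.
    import tensorflow as tf

    return tf.keras.callbacks.TensorBoard(**tensorboard_callback_kwargs(log_dir))


# Usage, assuming an existing compiled `model` and training data `x`, `y`:
#   model.fit(x, y, epochs=200,
#             callbacks=[make_tensorboard_callback("/path/to/logs")])
```

If the installed version does not accept profile_batch, the constructor will raise a TypeError, which is a quick way to check whether the argument is available.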