[tune] tf.summary.FileWriter extensibility for custom TensorBoard metrics


System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • Ray installed from (source or binary): source
  • Ray version: 0.6.6
  • Python version: 3.6.7
  • Exact command to reproduce: NA

Context: I rely on tune and TensorBoard for visualizing training, and I use callbacks to define custom metrics in the results dictionary that is then passed to TFLogger.
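For reference, a minimal sketch of the callback setup I mean, assuming the dictionary-style callbacks config of this Ray version (the metric name and environment are placeholders):

import ray
from ray import tune

def on_episode_end(info):
    # RLlib passes the episode in the callback's info dict; values stored in
    # episode.custom_metrics end up in the results dictionary handed to the logger.
    episode = info["episode"]
    episode.custom_metrics["my_custom_metric"] = 42.0

ray.init()
tune.run(
    "PPO",
    config={
        "env": "CartPole-v0",
        "callbacks": {"on_episode_end": tune.function(on_episode_end)},
    },
)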

Problem: Ray saves scalars only, and all of them end up under the same 'ray' tab in TensorBoard. Having tens of metrics under one tab hurts readability, especially when the end user is adding custom metrics. It would be a great feature to let users access TFLogger._file_writer so that they can add custom metrics (not just scalars) under custom tabs. Note that creating a second tf.summary.FileWriter is not an option, as two FileWriters sharing the same logdir are not supported at this time. Question: what is the recommended way to achieve this?
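To make the request concrete, this is roughly what I would like to do if TFLogger._file_writer were reachable. This is a sketch of the custom-tab part using the TF1 tf.Summary proto; the file_writer argument, the tag, and the helper name are all hypothetical:

import tensorflow as tf

def log_custom_metric(file_writer, step, value):
    # A "tab/metric" tag groups the metric under its own TensorBoard section
    # instead of the single 'ray' tab.
    summary = tf.Summary(value=[
        tf.Summary.Value(tag="custom_tab/my_metric", simple_value=value),
    ])
    file_writer.add_summary(summary, global_step=step)
    file_writer.flush()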

Attempts: Using a custom Logger instance is not an option, as the trainer is never passed to it (only the results are), which limits access to the custom metrics of interest. The on_train_result callback does pass the trainer (info["trainer"]), but from there I don't see how to access TFLogger._file_writer to save custom metrics to TensorBoard.
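For completeness, the shape of that attempt (registered the same dictionary-style way as the callback above; treat the details as an assumption):

def on_train_result(info):
    trainer = info["trainer"]   # the Trainer instance is available here
    result = info["result"]
    # ...but I see no handle from the trainer to the TFLogger._file_writer
    # used for this trial, so there is nothing to write a summary to.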

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 22 (15 by maintainers)

Top GitHub Comments

2 reactions
FedericoFontana commented, May 14, 2019

I’m happy to give it a try when you give me the ok.

1 reaction
OnTheRicky commented, Apr 30, 2020

@richardliaw I did as you recommended (no TFLogger, with the FileWriter instantiated in Trainer._init) and it works like a charm. Note that I've provided the computation graph when initializing the FileWriter. I think that rllib/tune should save the computation graph by default, as it is invaluable both for developers (debugging) and for users (understanding/visualizing the policy network without going through any source code).

What should the next step be (e.g. if PR, what should the PR change)?

class PPO(PPOTrainer):
    def _init(self, config, env_creator):
        super()._init(config, env_creator)
        # Create the FileWriter directly in the Trainer, pointing it at the
        # trial's logdir and passing the policy's TF graph so TensorBoard
        # can render the computation graph.
        self._file_writer = tf.summary.FileWriter(
            logdir=self.logdir,
            graph=self.get_policy().sess.graph,
        )
        self._file_writer.flush()
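
Building on the snippet above, one way to push custom summaries through that writer might be a method added to the same PPO subclass; this is a sketch, and the _train override plus the result keys are assumptions about this Trainer API version:

    def _train(self):
        # Per-iteration hook (assumption): run the normal training step, then
        # write an extra summary with the FileWriter created in _init.
        result = super()._train()
        summary = tf.Summary(value=[
            tf.Summary.Value(tag="custom/episode_reward_mean",
                             simple_value=result["episode_reward_mean"]),
        ])
        self._file_writer.add_summary(summary, result["training_iteration"])
        self._file_writer.flush()
        return result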


Has this been implemented already? If so, what changes are required to see the graph in TensorBoard?
