
Tensorboard hparams plugin API reports metrics from final epoch, *not* epoch with best performance on metric tracked via EarlyStopping

See original GitHub issue

System information

  • TensorFlow version: 2.3.0
  • Are you willing to contribute it: Yes, with guidance/help to know where to look

Describe the feature and the current behavior/state. The feature would change the current behavior of TensorBoard. Currently, TensorBoard displays the validation loss from the final epoch of training, which is not useful when comparing different models to each other.

Will this change the current API? How? This would change tensorboard.plugins.hparams.api to report the performance of the best training epoch rather than just the final epoch. Since the purpose of validation is to detect overfitting, reporting the best epoch is in line with the reasons for performing it.

Who will benefit from this feature? Everyone who uses TensorBoard to compare models based on the performance of the best training epoch.
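
To make the request concrete, here is a minimal sketch of a setup where this matters (the model, data, hyperparameter, and log-directory names are placeholders, not taken from the issue): validation loss is monitored with EarlyStopping, yet the HParams dashboard ends up showing the value from the last epoch that ran.

    # Illustrative sketch only; `model`, `x_train`, `x_val`, etc. are placeholders.
    import tensorflow as tf
    from tensorboard.plugins.hparams import api as hp

    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True)
    tb_cb = tf.keras.callbacks.TensorBoard(log_dir="logs/run1")     # writes the scalar summaries
    hp_cb = hp.KerasCallback("logs/run1", {"learning_rate": 1e-3})  # records the hparams for this run

    model.fit(x_train, y_train,
              validation_data=(x_val, y_val),
              epochs=100,
              callbacks=[early_stop, tb_cb, hp_cb])
    # The HParams dashboard picks up val_loss from the final logged epoch,
    # not from the epoch that EarlyStopping judged best.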

Any other info. I am currently using a standard tf.keras.callbacks.TensorBoard instance, subclassed with the following method as the only modification:

    def on_train_end(self, logs=None):
        # Clean up whatever was written under <log_dir>/train/plugins once training ends
        plugins_dir = os.path.join(self.log_dir, "train", "plugins")
        if os.path.exists(plugins_dir):
            shutil.rmtree(plugins_dir)
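
For context, a minimal sketch of how that override might sit inside a complete subclass (the class name `PatchedTensorBoard` is hypothetical, and the `super().on_train_end` call is an addition of this sketch that preserves the stock callback's own end-of-training behavior):

    import os
    import shutil
    import tensorflow as tf

    class PatchedTensorBoard(tf.keras.callbacks.TensorBoard):  # hypothetical name
        def on_train_end(self, logs=None):
            super().on_train_end(logs)  # keep the stock callback's end-of-training cleanup
            # Remove whatever was written under <log_dir>/train/plugins during training
            plugins_dir = os.path.join(self.log_dir, "train", "plugins")
            if os.path.exists(plugins_dir):
                shutil.rmtree(plugins_dir)

    tb_callback = PatchedTensorBoard(log_dir="logs/run1")  # used like the stock TensorBoard callback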

I am also creating a callback with the following code:

    from tensorboard.plugins.hparams import api as hp
    from tensorflow.keras.metrics import CategoricalAccuracy

    # Register the hyperparameters and the metric tag ("categorical_accuracy") to track
    hp.hparams_config(
        hparams=hparams_list,
        metrics=[hp.Metric(CategoricalAccuracy().name, display_name=CategoricalAccuracy().name)],
    )
    # Log this run's hyperparameter values for the HParams dashboard
    hp_callback = hp.KerasCallback(writer=output_dir, hparams=session_hparams)
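
For comparison, one possible workaround is a sketch like the following, which skips `hp.KerasCallback` and writes the hparams plus the best validation value manually after training. This is not behavior of the hparams plugin itself; it assumes `output_dir`, `session_hparams`, `model`, the training data, and an `early_stop` EarlyStopping callback are defined as in the setup above.

    import tensorflow as tf
    from tensorboard.plugins.hparams import api as hp

    # Train with EarlyStopping only; placeholders as noted above.
    history = model.fit(x_train, y_train,
                        validation_data=(x_val, y_val),
                        epochs=100,
                        callbacks=[early_stop])

    # Pick the best epoch's value rather than the final one.
    best_val_acc = max(history.history["val_categorical_accuracy"])

    with tf.summary.create_file_writer(output_dir).as_default():
        hp.hparams(session_hparams)  # record this run's hyperparameter values
        tf.summary.scalar("categorical_accuracy", best_val_acc, step=0)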

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

2 reactions
nfelt commented on Feb 5, 2021

Thanks for the details @brethvoice. I think it might make the most sense as a UI option that lets the user dynamically select whether to show the latest, max, or min value of the metric as the representative value. Here’s a mockup:

[mockup image]

Does that understanding match your use case?

If you’re interested in contributing this feature I can provide some pointers on the relevant code that would need to be changed.

1 reaction
psybuzz commented on Mar 16, 2021

“does that mean I need to close the issue?”

Not at all! To chime in on the thread, having a way to view the “best” value of a metric in the Hparams dashboard sounds like a very reasonable request, and keeping this issue open is helpful for the TensorBoard team to keep track of what issues are most important to users.

In general, feature requests are typically only closed when they are irrelevant to TB, duplicates of another issue, obsolete, already fixed in a newer version, or not a product priority for the TensorBoard team. On the other hand, an issue remaining ‘open’ does not necessarily mean that it will be fixed soon, given that the TensorBoard team has to balance priorities and work with limited resources.

Contributions are welcome, but are certainly not expected as an obligation to users.


Top Results From Across the Web

  • Tensorboard hparams plugin API reports metrics from final ...
    Tensorboard hparams plugin API reports metrics from final epoch, *not* epoch with best performance on metric tracked via EarlyStopping #4630.
  • Hyperparameter Tuning with the HParams Dashboard
    Adapt TensorFlow runs to log hyperparameters and metrics; Start runs and log them all under one parent directory; Visualize the results in ...
  • Tensorboard Only Producing Epoch Logs, Not Train/Val
    Tensorboard hparams plugin API reports metrics from final epoch not epoch with best performance on metric tracked via EarlyStopping.
  • Changelog — PyTorch Lightning 1.8.6 documentation
    Fixed epoch-end logging results not being reset after the end of the epoch (#14061) ... Added CPU metric tracking to DeviceStatsMonitor (#11795).
  • Introduction to CallBacks in Tensorflow 2 - ML Hive
    Tensorflow callbacks are very important to customize behaviour of ... Here is a basic example of callback using epoch end and training end....
