
Hyperparameter logging in Tensorboard broken

See original GitHub issue

🐛 Bug

Calling `self.save_hyperparameters()` with a dictionary while using the `TensorBoardLogger` does not log any hyperparameters.

In `pytorch_lightning/loggers/tensorboard.py`, the `log_hyperparams` method receives a `params` object containing a dictionary with all hyperparameters. At line 207 there is an `if metrics:` statement that must be entered for anything to actually be written to TensorBoard. However, `log_hyperparams` is never passed a `metrics` object, and the default is `None`.
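The guard described above can be sketched in plain Python (a hypothetical stand-in that mimics the reported control flow, not pytorch_lightning's actual implementation):

```python
# Minimal sketch of the `if metrics:` guard described above.
# Hypothetical stand-in, not pytorch_lightning's actual code.
def log_hyperparams(params, metrics=None):
    """Return the summaries that would be written to TensorBoard."""
    written = []
    if metrics:  # falsy when metrics is None or an empty dict
        written.append(("metrics", metrics))
        written.append(("hparams", params))
    return written

# The Trainer calls log_hyperparams with params only, so nothing
# is ever written -- which is the behavior being reported:
assert log_hyperparams({"A": "1", "B": 2}) == []
# Only an explicit, non-empty metrics dict unlocks the write path:
assert log_hyperparams({"A": "1"}, {"hp_metric": 0}) != []
```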

Traceback:

File "/.../lib/python3.9/site-packages/pytorch_lightning/loggers/tensorboard.py", line 207, in log_hyperparams
    <PROBLEM OCCURS HERE>
  File "/.../lib/python3.9/site-packages/pytorch_lightning/utilities/distributed.py", line 50, in wrapped_fn
    return fn(*args, **kwargs)
  File "/.../lib/python3.9/site-packages/pytorch_lightning/loggers/base.py", line 411, in log_hyperparams
    logger.log_hyperparams(params)
  File "/.../lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1259, in _log_hyperparams
    self.logger.log_hyperparams(hparams_initial)
  File "/.../lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1224, in _pre_dispatch
    self._log_hyperparams()
  File "/.../lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1188, in _run
    self._pre_dispatch()
  File "/.../lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/.../lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/.../lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
    self._call_and_handle_interrupt(

Question: why check `if metrics` in order to log hyperparameters? Replacing this with:

```python
        if metrics:
            self.log_metrics(metrics, 0)

        exp, ssi, sei = hparams(params, metrics)
        writer = self.experiment._get_file_writer()
        writer.add_summary(exp)
        writer.add_summary(ssi)
        writer.add_summary(sei)
```

and calling `logger.log_hyperparams` manually with a tuple of dictionaries makes the hyperparameter keys appear, but not their values. Instead, TensorBoard prints:

No hparams data was found.
Probable causes:

You haven’t written any hparams data to your event files.
Event files are still being loaded (try reloading this page).
TensorBoard can’t find your event files.
If you’re new to using TensorBoard, and want to find out how to add data and set up your event files, check out the [README](https://github.com/tensorflow/tensorboard/blob/master/README.md) and perhaps the [TensorBoard tutorial](https://www.tensorflow.org/get_started/summaries_and_tensorboard).

If you think TensorBoard is configured properly, please see [the section of the README devoted to missing data problems](https://github.com/tensorflow/tensorboard/blob/master/README.md#my-tensorboard-isnt-showing-any-data-whats-wrong) and consider filing an issue on GitHub.
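One plausible contributing factor for keys appearing without values (an assumption on my part, not confirmed in the issue) is that the hparams plugin only stores scalar values of type `bool`, `int`, `float`, or `str`. A defensive sanitizer along these lines would stringify anything else before logging:

```python
# Hedged sketch: coerce hyperparameter values to the scalar types the
# TensorBoard hparams plugin accepts; other values are stringified.
# Purely illustrative -- not the pytorch_lightning implementation.
SUPPORTED = (bool, int, float, str)

def sanitize_hparams(params: dict) -> dict:
    """Return a copy of params safe to pass to the hparams plugin."""
    return {k: (v if isinstance(v, SUPPORTED) else str(v))
            for k, v in params.items()}

# Supported scalars pass through; a list is turned into its repr:
assert sanitize_hparams({"A": "1", "B": 2, "C": [1, 2]}) == \
    {"A": "1", "B": 2, "C": "[1, 2]"}
```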

To Reproduce

  1. Create a LightningModule
  2. call `self.save_hyperparameters({"A": "1", "B": 2})`

Expected behavior

TensorBoard should show `A` and `B` with their respective values.

Environment

* CUDA:
	- GPU:
	- available:         False
	- version:           None
* Packages:
	- numpy:             1.21.2
	- pyTorch_debug:     False
	- pyTorch_version:   1.10.0
	- pytorch-lightning: 1.5.10
	- tqdm:              4.63.1
* System:
	- OS:                Darwin
	- architecture:
		- 64bit
		- 
	- processor:         i386
	- python:            3.9.7
	- version:           Darwin Kernel Version 20.6.0: Tue Oct 12 18:33:42 PDT 2021; root:xnu-7195.141.8~1/RELEASE_X86_64

Additional context

The issue of invisible values occurs both when launching TensorBoard with `--load_fast=true` and with `--load_fast=false`.

cc @awaelchli @edward-io @rohitgr7

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments:10 (5 by maintainers)
