tensorboard hyperparameters don't update
MWE:

- Run the first script below.
- Start `tensorboard --logdir=lightning_logs` in the same directory.
- Go to HPARAMS on the TensorBoard page.
- See only `layer_1_dim`.

Expected behavior:

- Run the first script below.
- Start `tensorboard --logdir=lightning_logs` in the same directory.
- Go to HPARAMS on the TensorBoard page.
- See both `layer_1_dim` and `another_hyperparameter`, with `another_hyperparameter` empty for `version_0`.

Solved:

- Run the second script. The trick is to call the net with all hyperparameters first; then TensorBoard picks up `another_hyperparameter`.

Sadly I have only little knowledge of TensorBoard and do not know what to search for. Maybe there is an option to set this; then I am very sorry, but I would appreciate a hint. Maybe this is also more an issue for pytorch-lightning, but I just do not know. Best wishes
```python
import pytorch_lightning as pl
from argparse import ArgumentParser
import torch


class LitMNIST(pl.LightningModule):
    def __init__(self, hparams):
        super(LitMNIST, self).__init__()
        self.hparams = hparams
        self.layer_1 = torch.nn.Linear(28 * 28, self.hparams.layer_1_dim)

    def forward(self, *args, **kwargs):
        pass


if __name__ == '__main__':
    # First run: only layer_1_dim is defined and logged.
    parser = ArgumentParser()
    parser.add_argument('--layer_1_dim', type=int, default=10)
    args = parser.parse_args()
    # print(args)
    ## > Namespace(layer_1_dim=10)
    model = LitMNIST(hparams=args)
    trainer = pl.Trainer()
    try:
        trainer.fit(model)  # fails early (the model is a stub), but hparams get logged
    except:
        pass

    # Second run: both hyperparameters are defined and logged.
    parser = ArgumentParser()
    parser.add_argument('--layer_1_dim', type=int, default=10)
    parser.add_argument('--another_hyperparameter', type=int, default=10)
    args = parser.parse_args()
    # print(args)
    ## > Namespace(another_hyperparameter=10, layer_1_dim=10)
    model = LitMNIST(hparams=args)
    trainer = pl.Trainer()
    try:
        trainer.fit(model)
    except:
        pass
```
Changed: the net is called with both hyperparameters first
```python
import pytorch_lightning as pl
from argparse import ArgumentParser
import torch


class LitMNIST(pl.LightningModule):
    def __init__(self, hparams):
        super(LitMNIST, self).__init__()
        self.hparams = hparams
        self.layer_1 = torch.nn.Linear(28 * 28, self.hparams.layer_1_dim)

    def forward(self, *args, **kwargs):
        pass


if __name__ == '__main__':
    # First run: both hyperparameters are defined and logged.
    parser = ArgumentParser()
    parser.add_argument('--layer_1_dim', type=int, default=10)
    parser.add_argument('--another_hyperparameter', type=int, default=10)
    args = parser.parse_args()
    # print(args)
    ## > Namespace(another_hyperparameter=10, layer_1_dim=10)
    model = LitMNIST(hparams=args)
    trainer = pl.Trainer()
    try:
        trainer.fit(model)  # fails early (the model is a stub), but hparams get logged
    except:
        pass

    # Second run: only layer_1_dim; HPARAMS now shows both columns anyway.
    parser = ArgumentParser()
    parser.add_argument('--layer_1_dim', type=int, default=10)
    args = parser.parse_args()
    # print(args)
    ## > Namespace(layer_1_dim=10)
    model = LitMNIST(hparams=args)
    trainer = pl.Trainer()
    try:
        trainer.fit(model)
    except:
        pass
```
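A possible workaround sketch (my own suggestion, not from the original report, and assuming the ordering dependence above is the whole problem): declare every hyperparameter in one shared parser so that each run logs the identical column set, which makes the HPARAMS table independent of which run is written first.

```python
# Hedged workaround sketch: one shared parser, so every run's hparams
# Namespace (and hence the logged hparam set) is identical.
from argparse import ArgumentParser


def build_parser():
    parser = ArgumentParser()
    parser.add_argument('--layer_1_dim', type=int, default=10)
    # Declared even for runs that never read it, so the HPARAMS table
    # always has this column (default values where unused).
    parser.add_argument('--another_hyperparameter', type=int, default=10)
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args()
    # Every run now logs both layer_1_dim and another_hyperparameter.
```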
I would like to catch up on this issue. I am not fully convinced that this is an issue on the `torch.utils.tensorboard` side. If I have two independent runs with different summary writer instances, and each run logs to a different directory (e.g. `~/001` and `~/002`), then I can point TensorBoard at each of the logdirs and see the full set of hyperparameters, respectively. Now I want to compare both runs in a single view, so I point TensorBoard at the parent dir, namely `~/`. If I check the hparams view again, I am left with only the intersection of both hyperparameter sets: all hyperparameters that are unique to one of the runs are not shown anymore.

To me that sounds like everything is logged properly via `torch.utils.tensorboard` individually, but when TensorBoard starts, it cannot properly iterate over all event files and build the complete hyperparameter table. Any thoughts on that? Or am I missing something? If this is a different issue, I am also happy to open a new one for it.

*edit: After searching a bit more, it seems my problem is related to: https://github.com/tensorflow/tensorboard/issues/2942
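For concreteness, here is a minimal sketch of that setup (relative directories stand in for `~/001` and `~/002`; the `extra_*` hyperparameters and the `dummy_metric` placeholder are made up for illustration):

```python
# Minimal sketch of the two-run comparison described above. Directory names,
# the 'extra_*' hyperparameters, and 'dummy_metric' are illustrative stand-ins.
from torch.utils.tensorboard import SummaryWriter

# Run 1 logs its full hyperparameter set to its own directory ("~/001").
with SummaryWriter('runs/001') as w:
    w.add_hparams({'layer_1_dim': 10, 'extra_a': 1}, {'dummy_metric': 0.0})

# Run 2 does the same in a second directory ("~/002").
with SummaryWriter('runs/002') as w:
    w.add_hparams({'layer_1_dim': 10, 'extra_b': 2}, {'dummy_metric': 0.0})

# tensorboard --logdir=runs/001  -> HPARAMS shows layer_1_dim and extra_a
# tensorboard --logdir=runs/002  -> HPARAMS shows layer_1_dim and extra_b
# tensorboard --logdir=runs      -> reportedly only the shared columns survive
```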
This looks like it's an issue in the PyTorch TensorBoard SummaryWriter implementation, which is maintained by PyTorch, not us: their API only supports writing all hparams in a single shot. I'd recommend following up at https://github.com/pytorch/pytorch/issues/39250.
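To illustrate the "single shot" point (a sketch; the log directory and metric name are assumptions, not from the thread): `add_hparams` takes the complete hparam dict in one call, and a later call starts a fresh sub-run rather than extending the earlier entry.

```python
# Sketch of the single-shot behaviour of torch.utils.tensorboard's add_hparams.
from torch.utils.tensorboard import SummaryWriter

w = SummaryWriter('runs/example')  # illustrative log directory
w.add_hparams({'layer_1_dim': 10}, {'dummy_metric': 0.0})

# There is no SummaryWriter API to append a hyperparameter to the entry above;
# a second call writes a brand-new timestamped sub-run with its own summary.
w.add_hparams({'another_hyperparameter': 10}, {'dummy_metric': 0.0})
w.close()
```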