Sweeps not initializing properly with PyTorch Lightening
See original GitHub issuewandb --version && python --version && uname
wandb, version 0.8.36
Python 3.7.6
Linux
Description
I’m trying to initialize a sweep using the WandB Logger for PyTorch Lightening. I’m following the keras example in ‘Intro to Hyperparameter Sweeps with W&B.ipynb’. I’m running it in jupyter on my own machine.
Basic problem: nothing gets loggeed to wandb when I run the sweep. Notable feature: when I start the sweep it initialized a new hyperparameter config and starts a new run. But then it initializes another run. Nothing gets logged to either of them.
Individual runs are fine.
What I Did
sweep_config = {
'method': 'random', #grid, random
'metric': {
'name': 'val_loss',
'goal': 'minimize'
},
'parameters': {
'lr': {
'min': 1e-4,
'max': 1e-1
},
}
}
sweep_id = wandb.sweep(sweep_config, entity="user", project="project-name")
Then I specify the training function:
wandb_logger = WandbLogger()
def train():
config_defaults = {
'epochs': 5,
'bs': 64,
'lr': 1e-3,
'seed': 42
}
wandb.init(config=config_defaults)
config = config_defaults
hparams = Namespace(
lr = config['lr'],
bs = config['bs']
)
wandb_logger.log_hyperparams(hparams)
model = AutoEncoder(hparams)
trainer = pl.Trainer(
logger=wandb_logger,
max_epochs=config['epochs'])
trainer.fit(model)
I then call the sweep
wandb.agent(sweep_id, train)
and get the following output at the start:
INFO:wandb.wandb_agent:Running runs: []
INFO:wandb.wandb_agent:Agent received command: run
INFO:wandb.wandb_agent:Agent starting run with config:
lr: 0.020506108917114917
wandb: Agent Starting Run: 59fu3sst with config:
lr: 0.020506108917114917
wandb: Agent Started Run: 59fu3sst
Logging results to Weights & Biases (Documentation).
Project page: https://app.wandb.ai/user/proj-name
Sweep page: https://app.wandb.ai/user/proj-name/sweeps/20thclh6
Run page: https://app.wandb.ai/user/proj-name/runs/59fu3sst
INFO:wandb.run_manager:system metrics and metadata threads started
INFO:wandb.run_manager:checking resume status, waiting at most 10 seconds
INFO:wandb.run_manager:resuming run from id: UnVuOnYxOjU5ZnUzc3N0OmVmZi1kaW0tcmVkLXByb2plY3Q6bGJyYW5uaWdhbg==
INFO:wandb.run_manager:upserting run before process can begin, waiting at most 10 seconds
INFO:wandb.run_manager:saving pip packages
INFO:wandb.run_manager:initializing streaming files api
INFO:wandb.run_manager:unblocking file change observer, beginning sync with W&B servers
Logging results to Weights & Biases (Documentation).
Project page: https://app.wandb.ai/user/proj-name
Run page: https://app.wandb.ai/user/proj-name/runs/xv9xywx7
So it starts the run and gives the sweeps page, but then seems to initialise a new run. There’s no additional wandb code in the model, it’s a standard PTL set-up.
Any suggestions?
Issue Analytics
- State:
- Created 3 years ago
- Comments:9 (6 by maintainers)
Top GitHub Comments
Hi,
These issues should now be solved.
Here are some examples for running sweeps with pytorch-lightning:
Let me know if you still run into any issue.
Actually you can log hyper-parameters with this object through
run.config.update(dict)
.