Sweep is launching runs as individual runs instead of attaching them to sweep

See original GitHub issue
  • Weights and Biases version: 0.8.19
  • Python version: 3.6.9
  • Operating System: Linux

Description

I’m attempting to run a sweep over hyperparameters but the jobs are launching as individual runs instead of as part of the sweep. Can you describe what conditions must hold for jobs to run as part of a sweep?

What I Did

The approximate structure of my workflow is this:

train.py:

  • sweep_id = wandb.sweep(sweep_config)
  • wandb.agent(sweep_id, function=train_sweep(hparams, run_config, hparams_file, sweep_config))

where train_sweep is a function that calls a model definition in model.py, as below.

model.py:

class Model:
    wandb.init(config=config_defaults)
    config = wandb.config
    # update model params by referring to config.param for each param
    model.train()

It seems that the init in model.py is starting a new context instead of attaching to the sweep run, even though it is started with the wandb agent. How can I fix this? Is there some way for me to “pass” the correct W&B sweep context to the model class instantiation in the other file?

Also, is there a way to instantiate a sweep without manually defining the defaults, i.e. just have it pick one from the given sweep config?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 13 (3 by maintainers)

Top GitHub Comments

2 reactions
mathemakitten commented, Jan 9, 2020

Actually, do you have any examples of using W+B with workflows where sweeps get started up and launch a run on a new TPU? I think I’m having issues between the W+B sweep agent context and the way I’m connecting to the TPU (i.e. with TF functions like tf.config.experimental_connect_to_cluster). For some reason it works OK in single-run mode though.

1 reaction
raubitsj commented, Jan 9, 2020

I think the issue is with:

wandb.agent(sweep_id, function=train_sweep(hparams, model, train_dataset_fn, eval_dataset_fn))

The function argument to wandb.agent() expects a callable, so unless train_sweep itself returns a function, you might want to do:

wandb.agent(sweep_id, function=lambda: train_sweep(hparams, model, train_dataset_fn, eval_dataset_fn))
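The difference between the two forms can be sketched without wandb at all. In the example below, run_agent is a hypothetical stand-in for a sweep agent (it is not wandb's implementation), and train_sweep, its tag argument, and hparams are placeholder names. Writing train_sweep(...) as the argument runs the training once, immediately, before any agent is involved, which would be consistent with a run showing up outside the sweep. Wrapping the call in a lambda or functools.partial defers it until the agent invokes it.

```python
from functools import partial

calls = []  # records which form triggered each training invocation


def train_sweep(hparams, tag):
    """Hypothetical training entry point; it only records that it ran."""
    calls.append(tag)
    # Like most training functions, this returns None, not a callable.


def run_agent(function, count=2):
    """Stand-in for a sweep agent (an assumption, not wandb's code):
    it calls `function` with no arguments once per scheduled run."""
    if not callable(function):
        raise TypeError("function must be callable")
    for _ in range(count):
        function()


hparams = {"lr": 0.01}

# Eager form: train_sweep executes right here, outside any agent,
# and the agent would only receive its return value (None).
eager_result = train_sweep(hparams, "eager")
try:
    run_agent(eager_result)
    eager_error = None
except TypeError as err:
    eager_error = str(err)

# Deferred forms: the agent decides when (and how often) training runs.
run_agent(lambda: train_sweep(hparams, "lambda"))
run_agent(partial(train_sweep, hparams, "partial"))

print(calls)
```

functools.partial is an alternative to the lambda that binds the arguments up front; either way, what gets passed is something the agent can call later rather than the result of a call that has already happened.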
