Support parallel sweeps with local controller


wandb --version && python --version && uname

  • Weights and Biases version: 0.8.18
  • Python version: 3.7.4
  • Operating System: Linux-4.4.0-1090-aws-x86_64-with-debian-stretch-sid

Description

I was trying to run multiple sweeps in parallel using a local controller. The documentation isn’t very clear, so I thought it was supported, with the only caveat that the controller would only schedule a new run once the scheduled queue was empty.

Based on the comments in wandb_controller, I was surprised that only one of my workers was running while all the others were idle. Maybe the definition of “scheduled” leaves room for confusion?

    Protocols:
        Scheduling a run:
        - client controller adds a schedule entry on the controller.schedule list
        - cloud backend notices the new entry and creates a run with the parameters
        - cloud backend adds a scheduled entry on the scheduler.scheduled list
        - client controller notices that the run has been scheduled and removes it from
          controller.schedule list
    Current implementation details:
        - Runs are only schedule if there are no other runs scheduled.
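To illustrate why the word “scheduled” matters here, below is a toy simulation of the handshake described in those comments. It is only a sketch and not wandb's actual code: the function name `peak_busy_agents`, the tick-based timing, and the `counts_until` switch are all hypothetical. It compares two readings of “no other runs scheduled”: an entry stops counting once an agent claims it, or it keeps counting until the run finishes. The second reading is consistent with the single busy worker reported in this issue, but which one the controller really uses is an assumption.

```python
def peak_busy_agents(num_agents, ticks, run_length, counts_until):
    """Toy model of the scheduling handshake; NOT wandb's implementation.

    counts_until (hypothetical switch):
      "claimed"  -> an entry stops counting as scheduled once an agent picks it up
      "finished" -> an entry keeps counting as scheduled until its run ends
    Returns the largest number of agents that were busy at the same time.
    """
    if counts_until not in ("claimed", "finished"):
        raise ValueError("counts_until must be 'claimed' or 'finished'")

    schedule = []  # requests written by the controller (controller.schedule)
    waiting = []   # runs created by the backend, not yet claimed by an agent
    running = []   # remaining ticks for each run currently on an agent
    peak = 0
    next_id = 0

    for _ in range(ticks):
        # Controller: request a run only if nothing else counts as scheduled.
        outstanding = len(schedule) + len(waiting)
        if counts_until == "finished":
            outstanding += len(running)
        if outstanding == 0:
            schedule.append(next_id)
            next_id += 1

        # Backend: turn every request into a scheduled run.
        waiting.extend(schedule)
        schedule.clear()

        # Agents: each idle agent claims one waiting run.
        while waiting and len(running) < num_agents:
            waiting.pop(0)
            running.append(run_length)

        peak = max(peak, len(running))

        # Advance time by one tick; finished runs free their agent.
        running = [left - 1 for left in running if left > 1]

    return peak


print(peak_busy_agents(4, 50, 10, "finished"))  # 1
print(peak_busy_agents(4, 50, 10, "claimed"))   # 4
```

Under the "finished" reading, the three extra agents never receive work, which matches the idle workers described above. Under the "claimed" reading, the same rule still fills all four agents, just one run per tick.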

What I Did

I defined the parameters for my local sweep in a YAML file:

program: run.py
method: bayes
metric:
  name: mape
  goal: minimize
controller:
  type: local
parameters:
  days:
    min: 1
    max: 7
  smoothing:
    min: 0.0
    max: 0.995

and ran from the command line:

$ wandb sweep --controller --verbose sweep.yaml

PS: The problem goes away when you run the cloud version of the sweep. I understand this may not be an urgent feature, but it would be great to at least make it clear that local sweeps do not support parallel agents for now.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
issue-label-bot[bot] commented, Dec 16, 2019

Issue-Label Bot is automatically applying the label enhancement to this issue, with a confidence of 0.72. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

0 reactions
Chen-Cai-OSU commented, Aug 27, 2021

I also need this feature to run several jobs (each with low GPU utilization) within one sweep. It would be very nice if wandb could include it.

