`wandb.agent` generates many runs with 0s runtime

See original GitHub issue

wandb --version && python --version && uname

  • Weights and Biases version: 0.9.1
  • Python version: 3.7.6
  • Operating System: Windows 10 Build 19041

Description

Hello. This is my first time using W&B for one of my projects.

I was trying to set up a Sweep for a Scikit-Learn model, sklearn.svm.SVC, in a Jupyter notebook on JupyterLab (jupyter lab --version: 2.1.4).

I set my routine up according to this example on sweeps in a notebook and this Scikit integration example.

What I Did

Here is the configuration I handed to wandb.sweep:

sweep_config = {
  'name': 'bayes_recall_fixed-split',
  'method': 'bayes',
  'program': 'train.py',
  'metric': {'name': 'pr', 'goal': 'maximize'},
  'parameters': {
        'C': {
            'min': 0.0,
            'max': 1e4,
            'distribution': 'normal',
            'mu': 5e3,
            'sigma': 1e3
        },
       'gamma': {
           'min': 1.0,
           'max': 100.0,
           'distribution': 'uniform'
       }
    }
}

train.py refers to the following routine:

import wandb
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

RANDOM_STATE = 4261998

LABELS = [1, 2, 3, 5, 7, 9, 11]

d_full = pd.read_csv('../data/d_full.csv')
# Split into features and target, converting both to numpy arrays
x = d_full.drop('label', axis=1).to_numpy()  # all columns except target
y = d_full['label'].to_numpy()  # target column

FEATURES = d_full.drop('label', axis=1).columns

def train():
    config_defaults = {
        'C': 1.0,
        'gamma': 10.0
    }
    wandb.init(config=config_defaults, magic=True)
    # train_test_split returns (x_train, x_test, y_train, y_test), in that order
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, shuffle=True, random_state=RANDOM_STATE)
    model = SVC(C=wandb.config.C, gamma=wandb.config.gamma, probability=True)
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    y_probas = model.predict_proba(x_test)
    wandb.sklearn.plot_classifier(model, x_train, x_test,
                                  y_train, y_test,
                                  y_pred, y_probas,
                                  labels=LABELS,
                                  is_binary=False,
                                  model_name='SVC',
                                  feature_names=FEATURES)
    wandb.log({'roc': wandb.plots.ROC(y_test, y_probas, labels=LABELS)})
    wandb.log({'pr': wandb.plots.precision_recall(y_test, y_probas, plot_micro=True, labels=LABELS)})
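
One thing worth checking in a setup like the one above: as far as I understand, a Bayesian sweep optimizes a single numeric value, whereas the 'pr' key here is logged as a plot object. The sketch below is a hypothetical helper (check_sweep_config is not part of wandb) that compares a sweep config against a sample of what the training code logs. It is only a local sanity check under that assumption, not a diagnosis of this issue.

```python
from numbers import Real


def check_sweep_config(config, logged_example):
    """Return a list of problems found in a sweep config (hypothetical helper).

    `logged_example` is a sample of the dict the training code passes to
    wandb.log(), e.g. {'pr': 0.83}.
    """
    problems = []
    metric = config.get('metric', {}).get('name')
    value = logged_example.get(metric)
    if metric is None:
        problems.append("no metric name set")
    elif isinstance(value, bool) or not isinstance(value, Real):
        # Assumption: the optimizer needs a plain number under this key.
        problems.append(f"metric '{metric}' is not logged as a number")
    for name, spec in config.get('parameters', {}).items():
        if 'min' in spec and 'max' in spec and not spec['min'] < spec['max']:
            problems.append(f"parameter '{name}' has min >= max")
    return problems


sweep_config = {
    'method': 'bayes',
    'metric': {'name': 'pr', 'goal': 'maximize'},
    'parameters': {'gamma': {'min': 1.0, 'max': 100.0}},
}

plot_like = object()  # stands in for a plot object such as the one logged above
print(check_sweep_config(sweep_config, {'pr': plot_like}))
print(check_sweep_config(sweep_config, {'pr': 0.83}))
```

If the first call reports a problem, one option is to log a scalar score under the metric name in addition to the plot.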

Then, I get a sweep ID and start an agent:

sweep_id = wandb.sweep(sweep_config, project="sysfake")
wandb.agent(sweep_id)
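
For comparison, here is a minimal sketch of starting the agent from a notebook with the training function passed explicitly and the number of runs capped. It assumes the function and count parameters of wandb.agent as I understand them from the documentation (availability may depend on the installed version); run_sweep and max_runs are hypothetical names used only for illustration.

```python
def run_sweep(sweep_id, train_fn, max_runs=5):
    """Start a sweep agent that calls train_fn for at most max_runs runs."""
    import wandb  # imported here so the helper can be defined without wandb installed

    # function: the Python callable to execute for each run, instead of a program file
    # count: upper bound on the number of runs this agent will start
    wandb.agent(sweep_id, function=train_fn, count=max_runs)
```

In the notebook this would be called as run_sweep(sweep_id, train). Capping the count keeps an open-ended search from spawning runs indefinitely while the setup is still being debugged.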

This is where I started to get confused. It creates the sweep and initializes lots of different runs with values for the parameters I defined.

Create sweep with ID: 1il2b6xv
Sweep URL: https://app.wandb.ai/hsdicicco/sysfake/sweeps/1il2b6xv

wandb: Agent Starting Run: gud7rub8 with config:
	C: 5274.774165576465
	gamma: 66.74220210687481
wandb: Agent Starting Run: cs996x70 with config:
	C: 5899.761282758936
	gamma: 85.67701756918515
wandb: Agent Starting Run: dri2z1re with config:
	C: 6873.888479886229
	gamma: 40.92855993611387
wandb: Agent Starting Run: qvxsgo5v with config:
	C: 3767.487013501218
	gamma: 12.298612896326054
wandb: Agent Starting Run: jonb520l with config:
	C: 4892.649168340187
	gamma: 79.74140835180916
...

It seemed to me like it was generating more runs than I would consider normal. When I checked the sweep on the project page: [image]

Checking the sweep table: [image]

It says they are all running, but they all have 0s of runtime.

I’m aware that in a Bayesian-optimized search, W&B will continue running until manually stopped if a target is not defined. I’m also aware that enabling probability prediction on SVCs makes fitting expensive, as it applies Platt scaling internally.

I suspect I might just be impatient, but this seemed odd to me. Should I just let this run? Have I done something wrong?

Thanks in advance.

Update: 7/9/2020

I checked the sweep again this afternoon, and it seems that all of the runs have crashed:

[image]
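
A quick way to confirm how many runs ended up in each state, without clicking through the UI, is the public API. This is a sketch that assumes wandb.Api().sweep(...) and the runs and state attributes behave as documented; count_run_states is a hypothetical helper name.

```python
from collections import Counter


def count_run_states(sweep_path):
    """Count the runs of a sweep by state, e.g. 'running', 'finished', 'crashed'."""
    import wandb  # imported here so the helper can be defined without wandb installed

    # sweep_path is expected in the form "entity/project/sweep_id"
    sweep = wandb.Api().sweep(sweep_path)
    return Counter(run.state for run in sweep.runs)
```

For the sweep in this report the call would be count_run_states("hsdicicco/sysfake/1il2b6xv"), using the entity, project and sweep ID from the sweep URL above.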

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

2 reactions
tyomhak commented, Jul 10, 2020

Hi there, we’ll try to recreate and let you know when we find the issue. Meanwhile I would recommend updating your wandb version.

1 reaction
dicicch commented, Jul 11, 2020

I should also add that if I pick a crashed run from the sweep I referenced and check its logs, they appear to be empty. [image]


Top Results From Across the Web

wandb.agent generates many runs with 0s runtime #1149
As I was out of storage, runs were being canceled immediately as they were initiated. This caused the agents to generate a lot...