`wandb.agent` generates many runs with 0s runtime
wandb --version && python --version && uname
- Weights and Biases version: 0.9.1
- Python version: 3.7.6
- Operating System: Windows 10 Build 19041
Description
Hello. This is my first time using W&B for one of my projects.
I was trying to set up a sweep for a scikit-learn model, `sklearn.svm.SVC`, in a Jupyter notebook on JupyterLab (`jupyter lab --version`: 2.1.4).
I set my routine up according to this example on sweeps in a notebook and this Scikit integration example.
What I Did
Here is the configuration I handed to `wandb.sweep`:
sweep_config = {
    'name': 'bayes_recall_fixed-split',
    'method': 'bayes',
    'program': 'train.py',
    'metric': {'name': 'pr', 'goal': 'maximize'},
    'parameters': {
        'C': {
            'min': 0.0,
            'max': 1e4,
            'distribution': 'normal',
            'mu': 5e3,
            'sigma': 1e3
        },
        'gamma': {
            'min': 1.0,
            'max': 100.0,
            'distribution': 'uniform'
        }
    }
}
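A note on the `metric` entry: it names the value the Bayesian search tries to maximize. To the best of my understanding, that value needs to be a single number logged under the same key, whereas a precision-recall plot is a chart object. Below is a minimal sketch of one scalar that could be used instead, macro-averaged recall, in plain Python. The helper name `macro_recall` and the toy labels are illustrative assumptions, not part of the issue's code.

```python
from collections import defaultdict


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # occurrences of each true class
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy labels taken from the issue's LABELS list, for illustration only.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
score = macro_recall(y_true, y_pred)
print(round(score, 3))
```

Inside the training routine, the result could be logged with `wandb.log({'recall': score})`, with the sweep's metric name changed to `recall` to match. scikit-learn's `recall_score(y_true, y_pred, average='macro')` computes the same quantity.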
`train.py` refers to the following routine:
import wandb
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

RANDOM_STATE = 4261998
LABELS = [1, 2, 3, 5, 7, 9, 11]

d_full = pd.read_csv('../data/d_full.csv')
x, y = *(d_full.drop('label', axis=1),  # all columns except target
         d_full['label']),              # target column
# convert both to numpy arrays
x, y = map(lambda j: j.to_numpy(), (x,y))
FEATURES = d_full.drop('label', axis=1).columns

def train():
    config_defaults = {
        'C': 1.0,
        'gamma': 10.0
    }
    wandb.init(config=config_defaults, magic=True)
    x_train, y_train, x_test, y_test = train_test_split(x, y, test_size=0.1, shuffle=True, random_state=RANDOM_STATE)
    model = SVC(C=wandb.config.C, gamma=wandb.config.gamma, probability=True)
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    y_probas = model.predict_proba(x_test)
    wandb.sklearn.plot_classifier(model, x_train, x_test,
                                  y_train, y_test,
                                  y_pred, y_probas,
                                  labels=LABELS,
                                  is_binary=False,
                                  model_name='SVC',
                                  feature_names=FEATURES)
    wandb.log({'roc': wandb.plots.ROC(y_test, y_probas, labels=LABELS)})
    wandb.log({'pr': wandb.plots.precision_recall(y_test, y_probas, plot_micro=True, labels=LABELS)})
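One detail in the routine above may be worth a second look. scikit-learn's `train_test_split` returns the splits grouped per input array: train and test for the first array, then train and test for the second. The unpacking in `train()` lists the names in a different order. A small sketch with toy lists (not the issue's data) shows the returned order:

```python
from sklearn.model_selection import train_test_split

# Toy data, for illustration only: 10 samples with one feature each.
x = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]

# Returned order: x_train, x_test, y_train, y_test.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.1, shuffle=True, random_state=0
)

print(len(x_train), len(x_test), len(y_train), len(y_test))
```

With the order used in the issue, the name `y_train` would be bound to the test features and `x_test` to the training labels, so `model.fit(x_train, y_train)` would likely fail on mismatched sample counts. Whether this explains the crashed runs is a guess, since the run logs were empty.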
Then, I get a sweep ID and start an agent:
sweep_id = wandb.sweep(sweep_config, project="sysfake")
wandb.agent(sweep_id)
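Note that `wandb.agent` is called here with only the sweep ID, so the `train` function defined in the notebook is never handed to the agent. Below is a minimal sketch of passing it explicitly, assuming the installed wandb version accepts the `function` and `count` keyword arguments. The wrapper `run_agent` and its `agent` parameter are hypothetical, added only so the call can be exercised without a W&B account. Whether this is the cause of the 0s runs is not confirmed.

```python
def run_agent(sweep_id, train_fn, max_runs=5, agent=None):
    """Start a sweep agent that calls train_fn in-process, at most max_runs times.

    `agent` is a hypothetical injection point for dry runs; by default the
    real wandb.agent is used.
    """
    if agent is None:
        import wandb  # imported lazily so the sketch loads without wandb installed
        agent = wandb.agent
    # function=...: callable executed for each trial; count=...: cap on the number of runs.
    return agent(sweep_id, function=train_fn, count=max_runs)


# Dry run with a stand-in agent that only records how it was called.
calls = []


def fake_agent(sweep_id, function=None, count=None):
    calls.append((sweep_id, function, count))


def train():
    pass  # placeholder for the issue's train() routine


run_agent("1il2b6xv", train, max_runs=3, agent=fake_agent)
print(calls[0][0], calls[0][2])
```

In the notebook this would be `run_agent(sweep_id, train)`. Capping the run count also bounds a Bayesian search that would otherwise keep proposing runs until stopped.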
This is where I started to get confused. It creates the sweep and initializes lots of different runs with values for the parameters I defined.
Create sweep with ID: 1il2b6xv
Sweep URL: https://app.wandb.ai/hsdicicco/sysfake/sweeps/1il2b6xv
wandb: Agent Starting Run: gud7rub8 with config:
    C: 5274.774165576465
    gamma: 66.74220210687481
wandb: Agent Starting Run: cs996x70 with config:
    C: 5899.761282758936
    gamma: 85.67701756918515
wandb: Agent Starting Run: dri2z1re with config:
    C: 6873.888479886229
    gamma: 40.92855993611387
wandb: Agent Starting Run: qvxsgo5v with config:
    C: 3767.487013501218
    gamma: 12.298612896326054
wandb: Agent Starting Run: jonb520l with config:
    C: 4892.649168340187
    gamma: 79.74140835180916
...
It seemed to be generating more runs than I would consider normal. When I checked the sweep on the project page and in the sweep table, every run was listed as running, but each had 0s of runtime.
I’m aware that in a Bayesian-optimized search, W&B will continue running until manually stopped if a target is not defined. I’m also aware that enabling probability prediction on SVCs makes fitting expensive, because it applies Platt scaling internally.
I suspect I might just be impatient, but this seemed odd to me. Should I just let this run? Have I done something wrong?
Thanks in advance.
Update: 7/9/2020
I checked the sweep again this afternoon, and it seems that all of the runs have crashed.
Top GitHub Comments
Hi there, we’ll try to recreate and let you know when we find the issue. Meanwhile I would recommend updating your wandb version.
I should also add that if I open a crashed run from the sweep I referenced and check its logs, they appear to be empty.