[CLI]: Connection issues with local (Docker) setup and Optuna multi-threading
See original GitHub issueDescribe the bug
Trying to combine Optuna and a Docker wandb instance, I get connection issues when using multi-threading. Removing the n_jobs=2 parameter from optuna.create_study() makes the issue go away. (Remoing the wandb-related code will also get rid of the errors, even with n_jobs enabled). I hope there is a way to use wandb and still run a parallelized study with Optuna.
Note that both wandb server and the client run on the same machine, just in different Docker containers. Thus, real network issues should not be the issue.
import wandb
import optuna
import numpy as np
def objective(trial: optuna.Trial) -> float:
config = dict(trial.params)
config["trial.number"] = trial.number
wandb.init(
project="my_project",
group="wandb_mwe",
config=config,
reinit=True,
# settings=wandb.Settings(start_method="fork"), # does not make a difference
# settings=wandb.Settings(start_method="thread"), # does not make a difference
)
score = np.random.random(1)
wandb.run.summary["final score"] = score
wandb.run.summary["state"] = "completed"
wandb.finish(quiet=True)
return score
wandb.login(
key="",
host="http://my.local.net:8000",
# host="http://wandb-docker-container:8000", # not a valid url
relogin=True,
)
study = optuna.create_study(study_name="my_study", direction="maximize")
study.optimize(objective, n_trials=5, n_jobs=2)
$ /opt/conda/bin/python /workspaces/my_project/src/wandb_errors_mwe.py
wandb: Appending key for my.local.net to your netrc file: /home/vscode/.netrc
[I 2022-08-18 11:27:58,400] A new study created in memory with name: my_study
/opt/conda/lib/python3.9/site-packages/optuna/study/study.py:393: FutureWarning: `n_jobs` argument has been deprecated in v2.7.0. This feature will be removed in v4.0.0. See https://github.com/optuna/optuna/releases/tag/v2.7.0.
warnings.warn(
wandb: Currently logged in as: username. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.13.1
wandb: Run data is saved locally in /workspaces/my_project/src/wandb/run-20220818_112758-r6nbia7o
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run spring-cloud-23
wandb: ⭐️ View project at http://my.local.net:8080/username/my_project
wandb: 🚀 View run at http://my.local.net:8080/username/my_project/runs/r6nbia7o
wandb: Waiting for W&B process to finish... (success).
wandb:
wandb: Synced spring-cloud-23: http://my.local.net:8080/username/my_project/runs/r6nbia7o
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 1 other file(s)
[I 2022-08-18 11:28:10,043] Trial 0 finished with value: 0.07086374553871444 and parameters: {}. Best is trial 0 with value: 0.07086374553871444.
wandb: Tracking run with wandb version 0.13.1
wandb: Run data is saved locally in /workspaces/my_project/src/wandb/run-20220818_112810-132mjr0f
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run kind-microwave-24
wandb: ⭐️ View project at http://my.local.net:8080/username/my_project
wandb: 🚀 View run at http://my.local.net:8080/username/my_project/runs/132mjr0f
wandb: Waiting for W&B process to finish... (success).
wandb:
wandb: Synced kind-microwave-24: http://my.local.net:8080/username/my_project/runs/132mjr0f
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 1 other file(s)
[I 2022-08-18 11:28:23,610] Trial 2 finished with value: 0.42732459058893624 and parameters: {}. Best is trial 2 with value: 0.42732459058893624.
Problem at: /workspaces/my_project/src/wandb_errors_mwe.py 10 objective
wandb: ERROR Error communicating with wandb process
wandb: ERROR For more info see: https://docs.wandb.ai/library/init#init-start-error
[W 2022-08-18 11:28:28,455] Trial 1 failed because of the following error: UsageError('Error communicating with wandb process\nFor more info see: https://docs.wandb.ai/library/init#init-start-error')
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/workspaces/my_project/src/wandb_errors_mwe.py", line 10, in objective
wandb.init(
File "/opt/conda/lib/python3.9/site-packages/wandb/sdk/wandb_init.py", line 1043, in init
run = wi.init()
File "/opt/conda/lib/python3.9/site-packages/wandb/sdk/wandb_init.py", line 691, in init
raise UsageError(error_message)
wandb.errors.UsageError: Error communicating with wandb process
For more info see: https://docs.wandb.ai/library/init#init-start-error
wandb: Tracking run with wandb version 0.13.1
wandb: Run data is saved locally in /workspaces/my_project/src/wandb/run-20220818_112823-2h3zkj2i
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run serene-frost-25
wandb: ⭐️ View project at http://my.local.net:8080/username/my_project
wandb: 🚀 View run at http://my.local.net:8080/username/my_project/runs/2h3zkj2i
wandb: Waiting for W&B process to finish... (success).
wandb:
wandb: Synced serene-frost-25: http://my.local.net:8080/username/my_project/runs/2h3zkj2i
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 1 other file(s)
[I 2022-08-18 11:28:36,211] Trial 3 finished with value: 0.5581329015664727 and parameters: {}. Best is trial 3 with value: 0.5581329015664727.
Traceback (most recent call last):
File "/workspaces/my_project/src/wandb_errors_mwe.py", line 38, in <module>
study.optimize(objective, n_trials=5, n_jobs=2)
File "/opt/conda/lib/python3.9/site-packages/optuna/study/study.py", line 400, in optimize
_optimize(
File "/opt/conda/lib/python3.9/site-packages/optuna/study/_optimize.py", line 106, in _optimize
f.result()
File "/opt/conda/lib/python3.9/concurrent/futures/_base.py", line 438, in result
return self.__get_result()
File "/opt/conda/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
raise self._exception
File "/opt/conda/lib/python3.9/concurrent/futures/thread.py", line 52, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/conda/lib/python3.9/site-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
trial = _run_trial(study, func, catch)
File "/opt/conda/lib/python3.9/site-packages/optuna/study/_optimize.py", line 264, in _run_trial
raise func_err
File "/opt/conda/lib/python3.9/site-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/workspaces/my_project/src/wandb_errors_mwe.py", line 10, in objective
wandb.init(
File "/opt/conda/lib/python3.9/site-packages/wandb/sdk/wandb_init.py", line 1043, in init
run = wi.init()
File "/opt/conda/lib/python3.9/site-packages/wandb/sdk/wandb_init.py", line 691, in init
raise UsageError(error_message)
wandb.errors.UsageError: Error communicating with wandb process
For more info see: https://docs.wandb.ai/library/init#init-start-error
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing.
wandb:
wandb: Synced 2 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20220818_112758-2b67cjb2/logs
Additional Files
Environment
WandB version: 0.13.1
OS: Linux (running in VSCode dev container)
Python version: 3.9.6
Versions of relevant libraries: optuna: 2.10.1
Additional Context
I have based my MWE on this post: https://medium.com/optuna/optuna-meets-weights-and-biases-58fc6bab893
Issue Analytics
- State:
- Created a year ago
- Comments:8

Top Related StackOverflow Question
Hi @mimxrt thank you for attaching the wandb folder, we will try to reproduce this error and get back to you soon with more information.
Thank you for the update @mimxrt.