Error communicating with backend

Hi,

Sorry for creating this as a new issue, but technically it is. I added a comment to #1287 about this, but since it is a different problem, I thought it would be best to track it in a new issue. Copying the comment from the previous thread:

I am on version 0.10.4 of the client and I hit a similar error, which I'm guessing is a network error. It happened on one of several similar runs.

From what I can tell, it seems something went wrong with login/init. Can the client circumvent this without crashing? Something simple I can think of is just allowing the user to increase the timeout, so the client just keeps polling the backend till it connects, rather than stopping the run. I’m not sure if this has problems I haven’t thought about.

Thanks!

Here’s the stack trace:

wandb: ERROR Error communicating with backend
Traceback (most recent call last):
  File "runners/mme/mme.py", line 64, in <module>
    config=args, reinit=True, project=project)
  File "/home/grad3/samarthm/bitbucket-misc/ssda_mme/utils/ioutils.py", line 398, in init
    wandb.init(*args, **kwargs)
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 460, in init
    run = wi.init()
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 378, in init
    raise UsageError(error_message)
wandb.errors.error.UsageError: Error communicating with backend

Issue Analytics

State:
Created 3 years ago
Comments:48 (14 by maintainers)

Top GitHub Comments


issue-label-bot[bot] commented, Oct 23, 2020

Issue-Label Bot is automatically applying the label bug to this issue, with a confidence of 0.67. Please mark this comment with 👍 or 👎 to give our bot feedback!



chrischoy commented, Nov 7, 2020

Hmm I was wrong, I got this error again with 0.10.8

2020-11-07 01:04:34,387 INFO    MainThread:65675 [internal.py:wandb_internal():62] W&B internal server running at pid: 65675
2020-11-07 01:04:34,389 INFO    WriterThread:65675 [datastore.py:open_for_write():76] open: wandb/run-20201107_010319-17yifxe3/run-17yifxe3.wandb
2020-11-07 01:04:34,390 DEBUG   SenderThread:65675 [sender.py:send():89] send: header
2020-11-07 01:04:34,390 DEBUG   HandlerThread:65675 [handler.py:handle_request():54] handle_request: check_version
2020-11-07 01:04:34,391 DEBUG   HandlerThread:65675 [handler.py:handle_request():54] handle_request: shutdown
2020-11-07 01:04:34,392 DEBUG   SenderThread:65675 [sender.py:send():89] send: request
2020-11-07 01:04:34,392 DEBUG   SenderThread:65675 [sender.py:send_request():98] send_request: check_version
2020-11-07 01:04:34,392 INFO    HandlerThread:65675 [handler.py:finish():267] shutting down handler
2020-11-07 01:04:34,398 DEBUG   Thread-4  :65675 [connectionpool.py:_new_conn():939] Starting new HTTPS connection (1): pypi.org:443
2020-11-07 01:04:34,446 DEBUG   Thread-4  :65675 [connectionpool.py:_make_request():433] https://pypi.org:443 "GET /pypi/wandb/json HTTP/1.1" 200 51383
2020-11-07 01:04:34,457 INFO    SenderThread:65675 [sender.py:finish():608] shutting down sender
2020-11-07 01:04:35,392 INFO    WriterThread:65675 [datastore.py:close():257] close: wandb/run-20201107_010319-17yifxe3/run-17yifxe3.wandb
2020-11-07 01:04:35,393 INFO    MainThread:65675 [internal.py:handle_exit():137] Internal process exited

2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_init.py:_log_setup():293] Logging user logs to wandb/run-20201107_010319-17yifxe3/logs/debug.log
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_init.py:_log_setup():294] Logging internal logs to wandb/run-20201107_010319-17yifxe3/logs/debug-internal.log
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_setup.py:_flush():69] setting env: {}
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_setup.py:_flush():69] setting user settings: {'save_code': False, 'email': '--------@gmail.com'}
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_setup.py:_flush():69] multiprocessing start_methods=fork,spawn,forkserver
2020-11-07 01:04:35,457 INFO    MainThread:65105 [wandb_init.py:teardown():154] tearing down wandb.init

After some debugging, I found that wandb.init(...) works, but when I use pytorch_lightning.loggers.WandbLogger(...).experiment it fails, giving the same Error communicating with backend.

This might be related to some hard coded arguments in the pytorch_lightning wandb.init call.

import pytorch_lightning as pl
import wandb

db = pl.loggers.WandbLogger(name='new_test', project='test')
# Following line sometimes fails
print(db.experiment)

# Copy of https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/loggers/wandb.py#L125
# It also fails sometimes
wandb.init(name=db._name, dir=db._save_dir, project=db._project, anonymous=db._anonymous, reinit=True, id=db._id, resume='allow', **db._kwargs)

I ended up using the following script. One could subclass the logger instead, but I'll just set the _experiment attribute of the PL WandbLogger directly.

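import time

import pytorch_lightning as pl
import wandb

logger = pl.loggers.WandbLogger(name='new_test', project='test')
# Keep retrying wandb.init until it succeeds, then hand the run to the logger.
while True:
    try:
        logger._experiment = wandb.init(name=logger._name, project=logger._project)
        break
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        print("Retrying")
        time.sleep(10)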

....
# works fine

If it fails, it successfully initializes on the second try.
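
The loop above retries forever. For anyone who wants a bounded version, here is a minimal sketch of the same idea with a cap on attempts and a growing delay. The helper name retry_with_backoff and its parameters (max_attempts, base_delay, sleep) are hypothetical and not part of wandb or pytorch_lightning; the function just wraps whatever callable you pass in.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn() until it succeeds or max_attempts is reached.

    Hypothetical helper for illustration. The delay doubles after each
    failed attempt (base_delay, 2 * base_delay, ...). The last failure
    is re-raised so the caller still sees the original error.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:  # does not catch KeyboardInterrupt
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay:.0f}s")
            sleep(delay)
    raise AssertionError("unreachable")
```

With the logger from the script above, the call would look like logger._experiment = retry_with_backoff(lambda: wandb.init(name=logger._name, project=logger._project)). Whether retrying is enough depends on why the backend call fails in the first place; this only covers the transient case described here.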
