Error communicating with backend

Hi,

Sorry for opening a new issue, but technically this is one. I added a comment to #1287 about this, but since it is a different problem, I thought it best to track it separately. Copying my comment from that thread:

I am on version 0.10.4 of the client and hit a similar error, which I’m guessing is a network error. It happened in one of several similar runs.

From what I can tell, it seems something went wrong with login/init. Can the client circumvent this without crashing? Something simple I can think of is just allowing the user to increase the timeout, so the client just keeps polling the backend till it connects, rather than stopping the run. I’m not sure if this has problems I haven’t thought about.

Thanks!

Here’s the stack trace:

wandb: ERROR Error communicating with backend
Traceback (most recent call last):
  File "runners/mme/mme.py", line 64, in <module>
    config=args, reinit=True, project=project)
  File "/home/grad3/samarthm/bitbucket-misc/ssda_mme/utils/ioutils.py", line 398, in init
    wandb.init(*args, **kwargs)
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 460, in init
    run = wi.init()
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 378, in init
    raise UsageError(error_message)
wandb.errors.error.UsageError: Error communicating with backend

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 48 (14 by maintainers)

Top GitHub Comments

16 reactions
issue-label-bot[bot] commented, Oct 23, 2020

Issue-Label Bot is automatically applying the label bug to this issue, with a confidence of 0.67. Please mark this comment with 👍 or 👎 to give our bot feedback!

5 reactions
chrischoy commented, Nov 7, 2020

Hmm, I was wrong: I got this error again with 0.10.8.

2020-11-07 01:04:34,387 INFO    MainThread:65675 [internal.py:wandb_internal():62] W&B internal server running at pid: 65675
2020-11-07 01:04:34,389 INFO    WriterThread:65675 [datastore.py:open_for_write():76] open: wandb/run-20201107_010319-17yifxe3/run-17yifxe3.wandb
2020-11-07 01:04:34,390 DEBUG   SenderThread:65675 [sender.py:send():89] send: header
2020-11-07 01:04:34,390 DEBUG   HandlerThread:65675 [handler.py:handle_request():54] handle_request: check_version
2020-11-07 01:04:34,391 DEBUG   HandlerThread:65675 [handler.py:handle_request():54] handle_request: shutdown
2020-11-07 01:04:34,392 DEBUG   SenderThread:65675 [sender.py:send():89] send: request
2020-11-07 01:04:34,392 DEBUG   SenderThread:65675 [sender.py:send_request():98] send_request: check_version
2020-11-07 01:04:34,392 INFO    HandlerThread:65675 [handler.py:finish():267] shutting down handler
2020-11-07 01:04:34,398 DEBUG   Thread-4  :65675 [connectionpool.py:_new_conn():939] Starting new HTTPS connection (1): pypi.org:443
2020-11-07 01:04:34,446 DEBUG   Thread-4  :65675 [connectionpool.py:_make_request():433] https://pypi.org:443 "GET /pypi/wandb/json HTTP/1.1" 200 51383
2020-11-07 01:04:34,457 INFO    SenderThread:65675 [sender.py:finish():608] shutting down sender
2020-11-07 01:04:35,392 INFO    WriterThread:65675 [datastore.py:close():257] close: wandb/run-20201107_010319-17yifxe3/run-17yifxe3.wandb
2020-11-07 01:04:35,393 INFO    MainThread:65675 [internal.py:handle_exit():137] Internal process exited
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_init.py:_log_setup():293] Logging user logs to wandb/run-20201107_010319-17yifxe3/logs/debug.log
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_init.py:_log_setup():294] Logging internal logs to wandb/run-20201107_010319-17yifxe3/logs/debug-internal.log
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_setup.py:_flush():69] setting env: {}
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_setup.py:_flush():69] setting user settings: {'save_code': False, 'email': '--------@gmail.com'}
2020-11-07 01:03:19,899 INFO    MainThread:65105 [wandb_setup.py:_flush():69] multiprocessing start_methods=fork,spawn,forkserver
2020-11-07 01:04:35,457 INFO    MainThread:65105 [wandb_init.py:teardown():154] tearing down wandb.init

After some debugging, I found that wandb.init(...) works, but accessing pytorch_lightning.loggers.WandbLogger(...).experiment fails with the same "Error communicating with backend".

This might be related to some hard-coded arguments in the pytorch_lightning wandb.init call.

import pytorch_lightning as pl
import wandb

db = pl.loggers.WandbLogger(name='new_test', project='test')
# Following line sometimes fails
print(db.experiment)

# Copy of https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/loggers/wandb.py#L125
# It also fails sometimes
wandb.init(
    name=db._name, dir=db._save_dir, project=db._project,
    anonymous=db._anonymous, reinit=True, id=db._id,
    resume='allow', **db._kwargs,
)

I ended up using the following script. One could subclass the logger instead, but I simply set the _experiment attribute on the PyTorch Lightning WandbLogger directly.

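import time

import pytorch_lightning as pl
import wandb

logger = pl.loggers.WandbLogger(name='new_test', project='test')
# Keep retrying wandb.init until it succeeds, then hand the run to the logger.
while True:
    try:
        logger._experiment = wandb.init(name=logger._name, project=logger._project)
        break
    except Exception:  # not a bare except, so Ctrl-C still interrupts the loop
        print("Retrying")
        time.sleep(10)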

....
# works fine

If it fails, it successfully initializes on the second try.
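
The loop above retries forever, which can hang a job if the backend stays unreachable. A bounded variant with a growing delay might look like the sketch below. It is only an illustration: retry_call and all of its parameters are hypothetical names, not part of wandb or pytorch_lightning, and the delays are arbitrary defaults.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_call(
    fn: Callable[[], T],
    max_attempts: int = 5,
    initial_delay: float = 10.0,
    backoff: float = 2.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: call fn() until it succeeds or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except exceptions as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay:.0f}s")
            sleep(delay)
            delay *= backoff  # wait longer before each further attempt
```

With this helper, the loop could be written as logger._experiment = retry_call(lambda: wandb.init(name=logger._name, project=logger._project)), assuming the failure surfaces as an ordinary exception such as the UsageError in the stack trace.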
