Couldn't communicate with backend after 15 seconds

Hi,

My run exited with the error message in the title. What could this be from? Is it probably just a network error? In that case, I don’t think wandb should crash the entire run with no notification.

Here’s the full trace:

Ignoring settings passed to wandb.setup() which has already been configured.
Problem at: /home/grad3/samarthm/bitbucket-misc/ssda_mme/utils/ioutils.py 376 init
Traceback (most recent call last):
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 477, in init
    run = wi.init()
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 356, in init
    _backend=backend, _disable_warning=True, _settings=self.settings
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_login.py", line 102, in _login
    res = _backend.interface.communicate_login(key, anonymous)
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/interface/interface.py", line 446, in communicate_login
    "Couldn't communicate with backend after %s seconds" % timeout
wandb.errors.error.Error: Couldn't communicate with backend after 15 seconds
wandb: ERROR Abnormal program exit
Traceback (most recent call last):
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 477, in init
    run = wi.init()
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 356, in init
    _backend=backend, _disable_warning=True, _settings=self.settings
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_login.py", line 102, in _login
    res = _backend.interface.communicate_login(key, anonymous)
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/interface/interface.py", line 446, in communicate_login
    "Couldn't communicate with backend after %s seconds" % timeout
wandb.errors.error.Error: Couldn't communicate with backend after 15 seconds

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "runners/mme/mme.py", line 57, in <module>
    config=args, reinit=True, project='ssda_mme-runners')
  File "/home/grad3/samarthm/bitbucket-misc/ssda_mme/utils/ioutils.py", line 376, in init
    wandb.init(*args, **kwargs)
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 511, in init
    six.raise_from(Exception("problem"), error_seen)
  File "<string>", line 3, in raise_from
Exception: problem
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/multiprocessing/spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
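
On the point that a backend timeout should not take down the whole run: one option is to guard the init call on the caller's side. The following is a minimal sketch, not the workaround referred to in the comments and not part of wandb itself. The helper name `init_with_retry` and its `retries` and `delay` parameters are hypothetical. It catches a broad `Exception` because the trace above shows the timeout re-raised as a plain `Exception("problem")`.

```python
import time


def init_with_retry(init_fn, retries=3, delay=5.0, **init_kwargs):
    """Call init_fn(**init_kwargs), retrying on failure.

    Returns whatever init_fn returns, or None if every attempt fails,
    so the caller can decide whether to keep training without logging.
    """
    for attempt in range(1, retries + 1):
        try:
            return init_fn(**init_kwargs)
        except Exception as err:
            # Broad catch is deliberate: in the trace above the timeout
            # surfaces as a generic Exception("problem").
            print(f"init attempt {attempt}/{retries} failed: {err!r}")
            if attempt < retries:
                time.sleep(delay)  # wait before trying again
    return None


# Hypothetical usage with the call from the trace (requires wandb installed):
# import wandb
# run = init_with_retry(wandb.init, config=args, reinit=True,
#                       project="ssda_mme-runners")
# if run is None:
#     print("Continuing without wandb logging")
```

If the helper returns `None`, any later `wandb.log` calls in the training script would also need to be skipped or guarded.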

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

samarth4149 commented, Dec 18, 2020 (1 reaction)

Yes, it seems this was the best workaround. This was also helpful.

tyomhak commented, Sep 28, 2020 (1 reaction)

Will certainly look into it! Thanks for flagging this issue.
