Couldn't communicate with backend after 15 seconds
Hi,
My run exited with the error message in the title. What could be causing this? Is it likely just a network error? If so, I don’t think wandb should crash the entire run without any notification.
Here’s the full trace:
Ignoring settings passed to wandb.setup() which has already been configured.
Problem at: /home/grad3/samarthm/bitbucket-misc/ssda_mme/utils/ioutils.py 376 init
Traceback (most recent call last):
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 477, in init
run = wi.init()
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 356, in init
_backend=backend, _disable_warning=True, _settings=self.settings
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_login.py", line 102, in _login
res = _backend.interface.communicate_login(key, anonymous)
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/interface/interface.py", line 446, in communicate_login
"Couldn't communicate with backend after %s seconds" % timeout
wandb.errors.error.Error: Couldn't communicate with backend after 15 seconds
wandb: ERROR Abnormal program exit
Traceback (most recent call last):
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 477, in init
run = wi.init()
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 356, in init
_backend=backend, _disable_warning=True, _settings=self.settings
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_login.py", line 102, in _login
res = _backend.interface.communicate_login(key, anonymous)
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/interface/interface.py", line 446, in communicate_login
"Couldn't communicate with backend after %s seconds" % timeout
wandb.errors.error.Error: Couldn't communicate with backend after 15 seconds
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "runners/mme/mme.py", line 57, in <module>
config=args, reinit=True, project='ssda_mme-runners')
File "/home/grad3/samarthm/bitbucket-misc/ssda_mme/utils/ioutils.py", line 376, in init
wandb.init(*args, **kwargs)
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 511, in init
six.raise_from(Exception("problem"), error_seen)
File "<string>", line 3, in raise_from
Exception: problem
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/multiprocessing/spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/multiprocessing/spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
File "/home/grad3/samarthm/anaconda3/envs/pytorch3conda/lib/python3.7/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
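One way to keep a failed tracker start-up from killing the whole run is to wrap the init call in a retry loop and continue without tracking if every attempt fails. The following is only a minimal sketch: `init_with_retry` is a hypothetical helper, not part of wandb, and the retry count and delay are arbitrary assumptions. The trace above does not establish the cause of the timeout, so retrying may not help.

```python
import time


def init_with_retry(init_fn, retries=3, delay=5.0, sleep=time.sleep):
    """Call init_fn(); retry on failure and return None if every attempt fails.

    init_with_retry is a hypothetical helper written for illustration only.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return init_fn()
        except Exception as err:  # the trace above surfaces a bare Exception("problem")
            last_error = err
            print(f"init attempt {attempt}/{retries} failed: {err!r}")
            if attempt < retries:
                sleep(delay)  # wait before the next attempt
    print(f"continuing without tracking after {retries} attempts: {last_error!r}")
    return None


# Intended use with the call from the trace (assumes wandb is imported and args exists):
# run = init_with_retry(
#     lambda: wandb.init(config=args, reinit=True, project='ssda_mme-runners'))
```

With this pattern the training code has to check whether `run` is `None` before logging anything, so a tracking failure costs the metrics but not the run.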
Issue Analytics
- Created 3 years ago
- Comments:10 (2 by maintainers)
Top GitHub Comments
Yes, it seems this was the best workaround. This was also helpful.
Will certainly look into it! Thanks for flagging this issue.