[CLI] Network error (ReadTimeout), entering retry loop.
See original GitHub issueDescription
My program just hangs after wandb cannot connect and log data. The error is Network error (ReadTimeout), entering retry loop. See wandb/debug-internal.log for full traceback.
Wandb features wandb.log()
Environment
- OS: Ubuntu 18.04
- Environment: terminal on local machine
- Python Version: 3.7.10
- WandB: 0.10.24
Here is part of the debug-internal.log:
2021-04-06 04:11:40,754 DEBUG SenderThread:14386 [sender.py:send():160] send: stats
2021-04-06 04:11:49,658 DEBUG HandlerThread:14386 [handler.py:handle_request():120] handle_request: status
2021-04-06 04:11:49,659 DEBUG SenderThread:14386 [sender.py:send():160] send: request
2021-04-06 04:11:49,659 DEBUG SenderThread:14386 [sender.py:send_request():169] send_request: status
2021-04-06 04:12:10,483 WARNING Thread-5 :14386 [util.py:request_with_retry():768] requests_with_retry encountered retryable exception: 502 Server Error: Bad Gateway for url: https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream. func: <bound method Session.post of <requests.sessions.Session object at 0x7fe833fc7bd0>>, response: b'\n<html><head>\n<meta http-equiv="content-type" content="text/html;charset=utf-8">\n<title>502 Server Error</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Server Error</h1>\n<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>\n<h2></h2>\n</body></html>\n', args: ('https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream',), kwargs: {'json': {'complete': False, 'failed': False}}
2021-04-06 04:12:15,821 WARNING Thread-5 :14386 [util.py:request_with_retry():768] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream. func: <bound method Session.post of <requests.sessions.Session object at 0x7fe833fc7bd0>>, response: b'{"error":"Error 1040: Too many connections"}\n', args: ('https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream',), kwargs: {'json': {'complete': False, 'failed': False}}
2021-04-06 04:12:23,298 ERROR SenderThread:14386 [retry.py:__call__():111] Retry attempt failed:
Traceback (most recent call last):
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 384, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.7/http/client.py", line 1369, in getresponse
response.begin()
File "/usr/lib/python3.7/http/client.py", line 310, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.7/http/client.py", line 271, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
File "/usr/lib/python3.7/ssl.py", line 1071, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib/python3.7/ssl.py", line 929, in read
return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/prashant/py37/lib/python3.7/site-packages/requests/adapters.py", line 445, in send
timeout=timeout
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/util/retry.py", line 367, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/packages/six.py", line 686, in reraise
raise value
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 306, in _raise_timeout
raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Read timed out. (read timeout=10)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/prashant/py37/lib/python3.7/site-packages/wandb/old/retry.py", line 96, in __call__
result = self._call_fn(*args, **kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/wandb/sdk/internal/internal_api.py", line 123, in execute
return self.client.execute(*args, **kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/gql/client.py", line 52, in execute
result = self._get_result(document, *args, **kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/gql/client.py", line 60, in _get_result
return self.transport.execute(document, *args, **kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/gql/transport/requests.py", line 38, in execute
request = requests.post(self.url, **post_args)
File "/home/prashant/py37/lib/python3.7/site-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/requests/sessions.py", line 512, in request
resp = self.send(prep, **send_kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/requests/sessions.py", line 622, in send
r = adapter.send(request, **kwargs)
File "/home/prashant/py37/lib/python3.7/site-packages/requests/adapters.py", line 526, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='api.wandb.ai', port=443): Read timed out. (read timeout=10)
2021-04-06 04:12:28,406 DEBUG SenderThread:14386 [sender.py:send():160] send: stats
2021-04-06 04:12:41,545 DEBUG SenderThread:14386 [sender.py:send():160] send: stats
2021-04-06 04:12:43,407 DEBUG HandlerThread:14386 [handler.py:handle_request():120] handle_request: status
Issue Analytics
- State:
- Created 2 years ago
- Comments:32 (5 by maintainers)
Top Results From Across the Web
Network error (ReadTimeout), entering retry loop during sweep
I started a wandb sweep grid search and I am getting a lot of Network error (ReadTimeout), entering retry loop. from various kinds...
Read more >Weights and Biases: Login and network errors - Stack Overflow
This error happens when I use the command: wandb login host=https://api.wandb.ai I have tried to delete the . netrc file where the api...
Read more >Troubleshooting - Documentation - Weights & Biases - Wandb
If our library is unable to connect to the internet it will enter a retry loop and keep attempting to stream metrics until...
Read more >smly on Twitter: "just found out that `wandb offline` can be ...
wandb: Network error (ReadTimeout), entering retry loop. ... just found out that `wandb offline` can be used to temporarily work around the ...
Read more >Prototypical Network | Kaggle
... step size for learning rate reducing (outer loop) gamma = 0.5, # gamma for learning rate ... wandb: Network error (ReadTimeout), entering...
Read more >
Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free
Top Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found

I have same issue in latest wandb version
Hey @ramit-wandb ,
Just wanted to let you know that upgrading to
wandb 0.12.9solved my problem 😃 Thank you again for your time and reply 🙏