question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[CLI] Network error (ReadTimeout), entering retry loop.

See original GitHub issue

Description My program just hangs after wandb cannot connect and log data. The error is Network error (ReadTimeout), entering retry loop. See wandb/debug-internal.log for full traceback.

Wandb features wandb.log()

Environment

  • OS: Ubuntu 18.04
  • Environment: terminal on local machine
  • Python Version: 3.7.10
  • WandB: 0.10.24

Here is part of the debug-internal.log:

2021-04-06 04:11:40,754 DEBUG   SenderThread:14386 [sender.py:send():160] send: stats
2021-04-06 04:11:49,658 DEBUG   HandlerThread:14386 [handler.py:handle_request():120] handle_request: status
2021-04-06 04:11:49,659 DEBUG   SenderThread:14386 [sender.py:send():160] send: request
2021-04-06 04:11:49,659 DEBUG   SenderThread:14386 [sender.py:send_request():169] send_request: status
2021-04-06 04:12:10,483 WARNING Thread-5  :14386 [util.py:request_with_retry():768] requests_with_retry encountered retryable exception: 502 Server Error: Bad Gateway for url: https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream. func: <bound method Session.post of <requests.sessions.Session object at 0x7fe833fc7bd0>>, response: b'\n<html><head>\n<meta http-equiv="content-type" content="text/html;charset=utf-8">\n<title>502 Server Error</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Server Error</h1>\n<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>\n<h2></h2>\n</body></html>\n', args: ('https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream',), kwargs: {'json': {'complete': False, 'failed': False}}
2021-04-06 04:12:15,821 WARNING Thread-5  :14386 [util.py:request_with_retry():768] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream. func: <bound method Session.post of <requests.sessions.Session object at 0x7fe833fc7bd0>>, response: b'{"error":"Error 1040: Too many connections"}\n', args: ('https://api.wandb.ai/files/prashant-pand3y/unet-us-datasets/ubc-none-Dice-linear-8-weights-20210405-210012/file_stream',), kwargs: {'json': {'complete': False, 'failed': False}}
2021-04-06 04:12:23,298 ERROR   SenderThread:14386 [retry.py:__call__():111] Retry attempt failed:
Traceback (most recent call last):
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 384, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.7/http/client.py", line 1369, in getresponse
    response.begin()
  File "/usr/lib/python3.7/http/client.py", line 310, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.7/http/client.py", line 271, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.7/ssl.py", line 1071, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.7/ssl.py", line 929, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/prashant/py37/lib/python3.7/site-packages/requests/adapters.py", line 445, in send
    timeout=timeout
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/util/retry.py", line 367, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/packages/six.py", line 686, in reraise
    raise value
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/home/prashant/py37/lib/python3.7/site-packages/urllib3/connectionpool.py", line 306, in _raise_timeout
    raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Read timed out. (read timeout=10)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/prashant/py37/lib/python3.7/site-packages/wandb/old/retry.py", line 96, in __call__
    result = self._call_fn(*args, **kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/wandb/sdk/internal/internal_api.py", line 123, in execute
    return self.client.execute(*args, **kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/gql/client.py", line 52, in execute
    result = self._get_result(document, *args, **kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/gql/client.py", line 60, in _get_result
    return self.transport.execute(document, *args, **kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/wandb/vendor/gql-0.2.0/gql/transport/requests.py", line 38, in execute
    request = requests.post(self.url, **post_args)
  File "/home/prashant/py37/lib/python3.7/site-packages/requests/api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/home/prashant/py37/lib/python3.7/site-packages/requests/adapters.py", line 526, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='api.wandb.ai', port=443): Read timed out. (read timeout=10)
2021-04-06 04:12:28,406 DEBUG   SenderThread:14386 [sender.py:send():160] send: stats
2021-04-06 04:12:41,545 DEBUG   SenderThread:14386 [sender.py:send():160] send: stats
2021-04-06 04:12:43,407 DEBUG   HandlerThread:14386 [handler.py:handle_request():120] handle_request: status

Issue Analytics

  • State:open
  • Created 2 years ago
  • Comments:32 (5 by maintainers)

github_iconTop GitHub Comments

1reaction
cu-riecommented, May 4, 2022

I have same issue in latest wandb version

1reaction
CrohnEngineercommented, Jan 22, 2022

Hey @ramit-wandb ,

Just wanted to let you know that upgrading to wandb 0.12.9 solved my problem 😃 Thank you again for your time and reply 🙏

Read more comments on GitHub >

github_iconTop Results From Across the Web

Network error (ReadTimeout), entering retry loop during sweep
I started a wandb sweep grid search and I am getting a lot of Network error (ReadTimeout), entering retry loop. from various kinds...
Read more >
Weights and Biases: Login and network errors - Stack Overflow
This error happens when I use the command: wandb login host=https://api.wandb.ai I have tried to delete the . netrc file where the api...
Read more >
Troubleshooting - Documentation - Weights & Biases - Wandb
If our library is unable to connect to the internet it will enter a retry loop and keep attempting to stream metrics until...
Read more >
smly on Twitter: "just found out that `wandb offline` can be ...
wandb: Network error (ReadTimeout), entering retry loop. ... just found out that `wandb offline` can be used to temporarily work around the ...
Read more >
Prototypical Network | Kaggle
... step size for learning rate reducing (outer loop) gamma = 0.5, # gamma for learning rate ... wandb: Network error (ReadTimeout), entering...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found