[Bug] ERROR gcs_utils.py:137 – Failed to send request to gcs


Search before asking

  • I searched the issues and found no similar issues.

Ray Component

Ray Core

What happened + What you expected to happen

When I run `ray start` on the worker, I expect the head and worker to connect. Instead, I see the error below. This happens only on Ray 1.9; with Ray 1.8 the worker can talk to the head.

$ ray start --address='100.96.24.172:6379' --redis-password='5241590000000000'
Local node IP: 100.96.191.45
2022-02-05 16:39:15,662 ERROR gcs_utils.py:137 -- Failed to send request to gcs, reconnecting. Error <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1644079155.661725492","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3134,"referenced_errors":[{"created":"@1644079155.661724203","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>
2022-02-05 16:39:16,665 ERROR gcs_utils.py:137 -- Failed to send request to gcs, reconnecting. Error <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1644079156.664823970","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3134,"referenced_errors":[{"created":"@1644079156.664822821","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>
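
The `debug_error_string` in the log above is a JSON object, and the nested `grpc_status` of 14 is the gRPC code for UNAVAILABLE, which matches the reported `StatusCode.UNAVAILABLE`. A minimal sketch for flattening that payload follows; `summarize_grpc_error` and the partial status table are illustrative names, not part of Ray or gRPC.

```python
import json

# Payload copied from the debug_error_string of the first log entry above.
DEBUG_ERROR_STRING = (
    '{"created":"@1644079155.661725492",'
    '"description":"Failed to pick subchannel",'
    '"file":"src/core/ext/filters/client_channel/client_channel.cc",'
    '"file_line":3134,'
    '"referenced_errors":[{"created":"@1644079155.661724203",'
    '"description":"failed to connect to all addresses",'
    '"file":"src/core/lib/transport/error_utils.cc",'
    '"file_line":163,"grpc_status":14}]}'
)

# Partial table of gRPC status codes (illustrative subset only).
GRPC_STATUS_NAMES = {0: "OK", 4: "DEADLINE_EXCEEDED", 14: "UNAVAILABLE"}


def summarize_grpc_error(payload: str) -> list:
    """Return (description, status_name) pairs, outermost error first.

    status_name is None when an error node carries no grpc_status field.
    """
    def walk(node):
        status = node.get("grpc_status")
        yield node.get("description", ""), GRPC_STATUS_NAMES.get(status, status)
        for child in node.get("referenced_errors", []):
            yield from walk(child)

    return list(walk(json.loads(payload)))


if __name__ == "__main__":
    for description, status in summarize_grpc_error(DEBUG_ERROR_STRING):
        print(f"{description}: {status}")
```

Both log entries carry the same nested error, so the worker never established a connection to the address it was given for GCS.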

Versions / Dependencies

  • Ray 1.9
  • Python 3.7.10
  • OS Linux

Reproduction script

Install the above versions on 2 nodes.

Start the head node using ray.init()

Start the worker node using `ray start --address=… --redis-password=…`

Anything else

https://discuss.ray.io/t/error-gcs-utils-py-137-failed-to-send-request-to-gcs/4936

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

1 reaction
mwtian commented, Feb 15, 2022

@Nithanaroy I sent you a message on discuss.ray.io to set up a debugging session. What I want to investigate together:

  1. See if the issue exists in Ray 1.10.0
  2. After starting the Ray 1.9 head node, use redis-cli to read the value of the key GcsServerAddress from the head node address, e.g. 100.96.24.172:6379. Verify that it is set correctly.
  3. curl the GCS address from the worker node.

Please let us know if having a debugging session would help!
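
Steps 2 and 3 above both come down to whether the worker node can open a TCP connection to the head node's Redis port and to whatever address is stored under GcsServerAddress. The sketch below is a rough standard-library check for that, run from the worker. The helper names `parse_host_port` and `can_connect` are hypothetical, and the script does not read the Redis key itself: pass in the value obtained with redis-cli as a command-line argument.

```python
import socket
import sys


def parse_host_port(address: str) -> tuple:
    """Split 'host:port' into (host, port), raising ValueError if malformed."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


def can_connect(address: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to address succeeds within timeout."""
    host, port = parse_host_port(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Example: python check_ports.py 100.96.24.172:6379 <GcsServerAddress value>
    for addr in sys.argv[1:]:
        state = "reachable" if can_connect(addr) else "NOT reachable"
        print(f"{addr}: {state}")
```

If the Redis port is reachable but the GCS address is not, that would point at the network path or the advertised GCS address rather than at the `ray start` arguments. A successful TCP connect only shows the port is open; it does not prove the GCS service behind it is healthy.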

0 reactions
mwtian commented, Feb 19, 2022

Closing this issue since there is a workaround. But feel free to reopen if you think this use case needs to be supported!
