"failed to connect to all addresses" occurs by chance

Description When loading the same Triton model that we developed (Python backend), the load sometimes succeeds and sometimes fails with "failed to connect to all addresses".

Here is an example in which we load the same model twice: the first attempt fails and the second succeeds. DEVELOPED_MODEL is our model name.

triton-srv    | I0723 03:41:11.552163 1 model_repository_manager.cc:1065] loading: DEVELOPED_MODEL:1
triton-srv    | I0723 03:41:11.662098 1 python.cc:604] TRITONBACKEND_ModelInstanceInitialize: DEVELOPED_MODEL(CPU device 0)
triton-srv    | E0723 03:46:11.726061 1 model_repository_manager.cc:1242] failed to load 'DEVELOPED_MODEL' version 1: Internal: failed to connect to all addresses
triton-srv    | I0723 03:49:14.709051 1 model_repository_manager.cc:1065] loading: DEVELOPED_MODEL:1
triton-srv    | I0723 03:49:14.853075 1 python.cc:604] TRITONBACKEND_ModelInstanceInitialize: DEVELOPED_MODEL(CPU device 0)
triton-srv    | 2021-07-23 03:50:14,859 - DEVELOPED_MODEL- INFO - Logger Set Up!
triton-srv    | 2021-07-23 03:50:15,706 - DEVELOPED_MODEL - INFO - Model Initialization Complete!
triton-srv    | I0723 03:50:15.707758 1 model_repository_manager.cc:1239] successfully loaded 'DEVELOPED_MODEL' version 1
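
To see how long each load attempt ran before it failed or succeeded, the timestamps in a log like the one above can be diffed. The following is a minimal sketch and not part of the original report: the function name `load_durations` and the `year` parameter are hypothetical, and it assumes the glog-style prefix (`I0723 03:41:11.552163`) and the `loading:` / `failed to load` / `successfully loaded` messages shown above.

```python
import re
from datetime import datetime

# glog-style prefix as it appears in the log above, e.g. "I0723 03:41:11.552163".
# The format carries no year, so the caller supplies one (assumption).
GLOG_PREFIX = re.compile(r"([IEW])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})")


def load_durations(lines, year=2021):
    """Return a list of (outcome, seconds) pairs, one per load attempt."""
    results = []
    started = None
    for line in lines:
        match = GLOG_PREFIX.search(line)
        if not match:
            continue  # e.g. lines written by the Python logger
        stamp = datetime.strptime(
            f"{year}{match.group(2)} {match.group(3)}", "%Y%m%d %H:%M:%S.%f"
        )
        if "] loading: " in line:
            started = stamp
        elif started is not None and (
            "failed to load" in line or "successfully loaded" in line
        ):
            outcome = "failed" if "failed to load" in line else "loaded"
            results.append((outcome, (stamp - started).total_seconds()))
            started = None
    return results
```

Applied to the excerpt above, this reports roughly 300 s for the failed attempt and roughly 61 s for the successful one.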

We also set up Python logging in the module. The logger output appears in these lines:

triton-srv    | 2021-07-23 03:50:14,859 - DEVELOPED_MODEL- INFO - Logger Set Up!
triton-srv    | 2021-07-23 03:50:15,706 - DEVELOPED_MODEL - INFO - Model Initialization Complete!

But the first attempt does not show any Python logger output, so we guess that something goes wrong or is delayed inside the Triton server.

Our first fix, following this issue, was to set a larger timeout value. It helps and makes the error less frequent, but the load still fails by chance. (BTW, we set the timeout value to 60000 ms.)
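
Since a second load attempt succeeded in the log above, one possible stopgap besides raising the timeout is to retry the load a few times with a growing delay. This is only a sketch under assumptions: `load_with_retry` and its parameters are hypothetical names, and `load_fn` stands in for whatever call triggers the model load in your setup (it is not a Triton API).

```python
import time


def load_with_retry(load_fn, attempts=3, base_delay_s=5.0, sleep=time.sleep):
    """Call load_fn until it succeeds or the attempts run out.

    load_fn is a placeholder for the real load call (assumption).
    The delay doubles after every failed attempt.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return load_fn()
        except Exception as exc:  # narrow this to your client's error type
            last_error = exc
            if attempt < attempts:
                sleep(base_delay_s * 2 ** (attempt - 1))
    raise last_error
```

A retry only hides the intermittent failure; it does not explain why the first attempt never reaches the Python logger.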

Triton Information 21.03 - Triton Container

To Reproduce it occurs by chance …

Expected behavior Could anyone explain the possible causes of this problem? I hope there is another solution besides setting the timeout to 120000 ms …

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
GuanLuo commented, Jul 23, 2021

Can you try with the latest version of the Triton image?

0 reactions
SraRod commented, Sep 1, 2021

@Tabrizian Just an update on the situation. Since then, we have updated Triton to 21.07, and this error has not occurred again so far. (But 21.07 seems to use more computational resources.)
