Triton server multiple initialization errors, under kubernetes

See original GitHub issue

Hello, I hope someone can help me. I am seeing the following in the log:

$ kubectl logs test-triton-triton-inference-server-8b7bc6c84-drpn9

=============================
== Triton Inference Server ==
=============================

NVIDIA Release 20.08 (build 15533555)

Copyright (c) 2018-2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
find: '/usr/lib/ssl/private': Permission denied

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for the inference server.  NVIDIA recommends the use of the following flags:
   nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

I1016 08:14:03.413740 1 metrics.cc:184] found 1 GPUs supporting NVML metrics
I1016 08:14:03.419300 1 metrics.cc:193]   GPU 0: GeForce GTX 1080 Ti
I1016 08:14:03.419579 1 server.cc:119] Initializing Triton Inference Server
error: creating server: Internal - Unable to create GCS client. Check account credentials.

I am new to the Kubernetes environment; what can be done to make it work properly? (There are two identical GPUs installed in the system, and this is a local install.) I would prefer to use a persistent volume for the model repository, but that failed, so I am trying GCS now and there are still some problems. Any comments on the above are appreciated.

Thanks, Alper
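
For anyone hitting the same "Unable to create GCS client" error: Triton's GCS support typically picks up standard Google application credentials, so a common approach under Kubernetes is to store a service-account key in a secret, mount it into the pod, and point GOOGLE_APPLICATION_CREDENTIALS at it. A minimal sketch, with secret, key-file, and bucket names as illustrative placeholders rather than anything taken from this thread:

# Store a GCP service-account key in a Kubernetes secret (file name is illustrative)
$ kubectl create secret generic gcs-creds --from-file=sa-key.json

# In the Triton deployment spec, mount the secret and expose the key path
# through the standard environment variable:
#   env:
#     - name: GOOGLE_APPLICATION_CREDENTIALS
#       value: /secrets/gcs/sa-key.json
#   volumeMounts:
#     - name: gcs-creds
#       mountPath: /secrets/gcs
#       readOnly: true
#   volumes:
#     - name: gcs-creds
#       secret:
#         secretName: gcs-creds

# Then point the server at the bucket (placeholder name):
#   tritonserver --model-repository=gs://my-bucket/model_repository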

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 10 (4 by maintainers)

Top GitHub Comments

1 reaction
deadeyegoodwin commented, Oct 21, 2020

I’m still not sure if you are saying that you believe something is wrong or not. What are you expecting curl to do in these cases? By default curl just prints the body of the response (as in the last case). But for the first 3 there is nothing in the response body. All these endpoints do is return an HTTP status. You can see that status with -v or by using the -w flag as I showed… that is entirely up to how you want to use curl.
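
For example, a curl invocation along these lines prints only the HTTP status of a health endpoint (the path shown is the v2 readiness check on Triton's default HTTP port; the host is a placeholder):

# Print just the status code; the response body of these endpoints is empty
$ curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/health/ready

This should print the status code, e.g. 200 when the server is ready, even though nothing appears in the body.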

0 reactions
ontheway16 commented, Oct 21, 2020

I was in fact hoping to see some response, like the one in this link. (As shown a few messages above, /api/status also returns 400 with -v.) But anyway, directing the client script to pod_IP:port returns the correct inference output, so I can assume there are no problems here. Thank you again.
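
For reference, two standard ways to reach the pod as described above, assuming Triton's default HTTP port 8000:

# Show the pod IP that a client script can be pointed at
$ kubectl get pods -o wide

# Or forward the HTTP port to the local machine instead of using the pod IP directly
$ kubectl port-forward test-triton-triton-inference-server-8b7bc6c84-drpn9 8000:8000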

Read more comments on GitHub >

Top Results From Across the Web

  • Triton Inference Server's health status shows 'Connection peer ...
    Facing an error while connecting to the Triton inference server (the server startup seems to have errors).
  • Accelerating NLP at scale with NVIDIA Triton, Seldon Core ...
    Deploying and Scaling AI Applications with the NVIDIA TensorRT Inference Server on Kubernetes.
  • Triton Inference Server in GKE - NVIDIA - Google Kubernetes
    With Triton Inference Server, we have the ability to mark a model as PRIORITY_MAX. This means when we consolidate multiple models in the...
  • V2 Inference Protocol - KServe Documentation Website
    This protocol is endorsed by NVIDIA Triton Inference Server, ... The "server live" API can be used directly to implement the Kubernetes livenessProbe... (see the probe sketch after this list)
  • Serving TensorRT Models with NVIDIA Triton Inference Server
    In real-time AI model deployment at scale, efficiency of model inference and hardware/GPU usage is paramount.
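
As a concrete illustration of the livenessProbe point in the V2 Inference Protocol entry above, a probe block along these lines can use Triton's v2 health endpoints (assuming the default HTTP port 8000; adjust to your container spec):

  # Pod spec fragment: liveness and readiness checks against the v2 health endpoints
  livenessProbe:
    httpGet:
      path: /v2/health/live
      port: 8000
  readinessProbe:
    httpGet:
      path: /v2/health/ready
      port: 8000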
