
triton bert predictor - can not find config.pbtxt


/kind bug

What steps did you take and what happened:
I’m trying to deploy the Triton BERT model from the sample. The transformer deployed successfully, but the predictor fell into a CrashLoopBackOff state.
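(For anyone who hits the same state: the usual first step is to pull the crash reason from the pod events and the container log, which is what the output below shows. <predictor-pod> is a placeholder for the pod name from kubectl get pods.

kubectl -n kfserving-test describe pod <predictor-pod>
kubectl -n kfserving-test logs <predictor-pod> -c kfserving-container
)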

What did you expect to happen:

Anything else you would like to add:

kubectl -n kfserving-test logs bert-large-predictor-default-rtjz2-deployment-f4cc44d4c-zspww -c kfserving-container
...
2020-09-03 02:41:47.565925: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
I0903 02:41:47.612648 1 metrics.cc:164] found 1 GPUs supporting NVML metrics
I0903 02:41:47.618143 1 metrics.cc:173]   GPU 0: Tesla V100-SXM2-16GB
I0903 02:41:47.618395 1 server.cc:120] Initializing Triton Inference Server
E0903 02:41:47.836537 1 model_repository_manager.cc:1519] failed to open text file for read /mnt/models/1/config.pbtxt: No such file or directory
error: creating server: INTERNAL - failed to load all models
kubectl -n kfserving-test logs bert-large-predictor-default-rtjz2-deployment-f4cc44d4c-zspww -c storage-initializer
[I 200903 02:35:12 initializer-entrypoint:13] Initializing, args: src_uri [gs://kfserving-samples/models/triton/bert] dest_path[ [/mnt/models]
[I 200903 02:35:12 storage:35] Copying contents of gs://kfserving-samples/models/triton/bert to local
[I 200903 02:35:12 storage:111] Downloading: /mnt/models/1/model.savedmodel/saved_model.pb
[I 200903 02:35:12 storage:111] Downloading: /mnt/models/1/model.savedmodel/variables/variables.data-00000-of-00001
[I 200903 02:35:42 storage:111] Downloading: /mnt/models/1/model.savedmodel/variables/variables.index
[I 200903 02:35:42 storage:111] Downloading: /mnt/models/config.pbtxt
[I 200903 02:35:42 storage:60] Successfully copied gs://kfserving-samples/models/triton/bert to /mnt/models

The predictor looks for config.pbtxt in /mnt/models/1/, but the storage initializer downloaded it to /mnt/models/. Is that the correct download location?
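For context, Triton treats /mnt/models as the model repository root and every subdirectory of that root as a model that must contain its own config.pbtxt. Because the version directory 1/ landed at the root, Triton treated it as a model named "1" and probed /mnt/models/1/config.pbtxt. Below is a minimal sketch of such a config for a TensorFlow SavedModel, written next to the version directory; the tensor names, data types, and dims are illustrative placeholders, not the sample model's actual signature:

cat > bert_tf_v2_large_fp16_128_v2/config.pbtxt <<'EOF'
name: "bert_tf_v2_large_fp16_128_v2"
platform: "tensorflow_savedmodel"
max_batch_size: 8            # illustrative
input [
  {
    name: "input_ids"        # illustrative tensor name
    data_type: TYPE_INT32
    dims: [ 128 ]
  }
]
output [
  {
    name: "logits"           # illustrative tensor name
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
EOF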

Environment:

  • Istio Version:
  • Knative Version:
  • KFServing Version: 0.4.0
  • Kubeflow version:
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version:
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-beta.0", GitCommit:"e7f962ba86f4ce7033828210ca3556393c377bcc", GitTreeState:"clean", BuildDate:"2020-01-15T08:26:26Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.9-eks-4c6976", GitCommit:"4c6976793196d70bc5cd29d56ce5440c9473648e", GitTreeState:"clean", BuildDate:"2020-07-17T18:46:04Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release):

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
yuzisun commented, Sep 3, 2020

@mokpolar sorry, I uploaded the bert model at the wrong level!

gs://kfserving-samples/models/triton/bert
  |_ config.pbtxt
  |_ 1/

It should be:

gs://kfserving-samples/models/triton/bert
  |_ bert_tf_v2_large_fp16_128_v2
     |_ config.pbtxt
     |_ 1/
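(For reference, a quick way to verify the corrected layout with the Google Cloud SDK's gsutil:

gsutil ls gs://kfserving-samples/models/triton/bert/bert_tf_v2_large_fp16_128_v2/

which should list config.pbtxt and the 1/ version directory.)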

1 reaction
mokpolar commented, Sep 3, 2020

> @mokpolar sorry, I uploaded the bert model at the wrong level!
>
> gs://kfserving-samples/models/triton/bert
>   |_ config.pbtxt
>   |_ 1/
>
> It should be:
>
> gs://kfserving-samples/models/triton/bert
>   |_ bert_tf_v2_large_fp16_128_v2
>      |_ config.pbtxt
>      |_ 1/

I uploaded the model to a PVC and changed the directories to match the path you described. The pod has been deployed, thank you!

kubectl get pod -n kfserving-test
NAME                                                              READY   STATUS    RESTARTS   AGE
bert-large-predictor-default-5rjvx-deployment-76d5f587-5xd8v      2/2     Running   0          84s
bert-large-transformer-default-mw8mm-deployment-6c4f66c9c7ml9qs   2/2     Running   0          83s
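
For anyone reproducing the PVC route: KFServing's storage initializer also understands pvc:// URIs, so the predictor can point straight at the re-arranged model directory. A sketch, assuming the v1alpha2 triton predictor spec used by KFServing 0.4; the InferenceService name matches this thread, but the PVC name and path are hypothetical:

kubectl -n kfserving-test patch inferenceservice bert-large --type merge -p '
spec:
  default:
    predictor:
      triton:
        storageUri: pvc://models-pvc/triton/bert_tf_v2_large_fp16_128_v2
'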