
[Bug] add_sub example failed

See original GitHub issue

Problem: the add_sub example fails. The log shows:

INFO[client.py:82] Model add_sub_i0 load failed: [StatusCode.INTERNAL] failed to load 'add_sub_i0', no version is available

It is only an INFO message, but in the end I get empty metrics:

Server Only:
Model           GPU ID   Batch   Concurrency   Max GPU Memory Usage(MB)   Max GPU Memory Available(MB)   Max GPU Utilization(%)
triton-server   0        0       0             166.0                      14943.0                        0.0
triton-server   1        0       0             166.0                      14943.0                        0.0
triton-server   2        0       0             166.0                      14943.0                        0.0
triton-server   3        0       0             166.0                      14943.0                        0.0

Models (GPU Metrics):
Model   GPU ID   Batch   Concurrency   Model Config Path   Max GPU Memory Usage(MB)   Max GPU Memory Available(MB)   Max GPU Utilization(%)

Models (Inference):
Model   Batch   Concurrency   Model Config Path   Throughput(infer/sec)   Average Latency(us)   Max RAM Usage(MB)   Max RAM Available(MB)

Models (GPU Metrics - Failed Constraints):
Model   GPU ID   Batch   Concurrency   Model Config Path   Max GPU Memory Usage(MB)   Max GPU Memory Available(MB)   Max GPU Utilization(%)

Models (Inference - Failed Constraints):
Model   Batch   Concurrency   Model Config Path   Throughput(infer/sec)   Average Latency(us)   Max RAM Usage(MB)   Max RAM Available(MB)
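
For context, Triton typically reports "no version is available" when a model directory contains no numeric version subdirectory it can load. A quick way to check the mounted quick-start repository (a sketch; the exact backend files inside the version directory may differ):

    # Inspect the repository layout from inside the SDK container.
    # add_sub should contain a config.pbtxt and at least one numeric
    # version directory, e.g. 1/, holding the model file(s).
    ls -R /quick_start_repository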

All I’ve done is:

  • Pull images from NGC
    • nvcr.io/nvidia/tritonserver:21.03-py3-sdk as the docs describe
    • nvcr.io/nvidia/tritonserver:21.03-py3 for --triton-launch-mode=docker
  • Clone model_analyzer repo to $HOME
    • cd $HOME && git clone https://github.com/triton-inference-server/model_analyzer.git
  • Start the docker container as the docs describe:

      docker run -it --rm --gpus all \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -v $HOME/model_analyzer/examples/quick-start:/quick_start_repository \
        --net=host --name model-analyzer \
        nvcr.io/nvidia/tritonserver:21.03-py3-sdk /bin/bash

  • Under the /workspace folder, run (a mount sanity check is sketched right after this list):

      model-analyzer -m /quick_start_repository -n add_sub \
        --triton-launch-mode=docker --triton-version=21.03-py3 \
        --export-path=analysis_results --log-level=DEBUG \
        --override-output-model-repository
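
Before running the last command, it may help to confirm that both mounts are in place inside the SDK container (a minimal check using only paths from the steps above):

    # The Docker socket must be visible so model analyzer can launch a
    # sibling Triton container, and the quick-start repo must be mounted.
    ls -l /var/run/docker.sock
    ls /quick_start_repository/add_sub   # expect config.pbtxt and a version dir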

Did I miss something?

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
Tabrizian commented, Apr 19, 2021

I don’t understand where /path/to/model/repo should point, since I do not have any model. I assume /path/to/model/repo is just the model output directory and I should mount it. (Although I don’t know why it must be the same both inside and outside the container.)

Correct. /path/to/model/repo is the model output directory. The reason is that when you are using docker mode and mounting the Docker socket inside the container, the provided path must also exist outside the model analyzer container. Model analyzer writes Triton configs to the output model repository, so those changes need to be visible to the sibling container created through the Docker API. That is why both paths must be the same.
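
To make that concrete, a minimal sketch (using the /data0/fumi/... path that appears later in this thread; any absolute host path works, as long as it is identical on both sides of the mount):

    # Create the output repository on the HOST, then mount it into the SDK
    # container at the IDENTICAL path. Model analyzer hands this same path
    # to the Docker API, and the sibling Triton container resolves it
    # against the host filesystem.
    mkdir -p /data0/fumi/model_analyzer_output/model_output
    docker run -it --rm --gpus all \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /data0/fumi/model_analyzer_output:/data0/fumi/model_analyzer_output \
      -v $HOME/model_analyzer/examples/quick-start:/quick_start_repository \
      --net=host --name model-analyzer \
      nvcr.io/nvidia/tritonserver:21.03-py3-sdk /bin/bash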

Regarding the error, if you change the model analyzer command to the one below, it should work successfully:

model-analyzer -m /quick_start_repository -n add_sub \
  --triton-launch-mode=docker --triton-version=21.03-py3 \
  --export-path=analysis_results --log-level=DEBUG \
  --override-output-model-repository \
  --output-model-repository /data0/fumi/model_analyzer_output/model_output

Note the model_output at the end of the output model repository path.
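
A follow-up check, as a sketch (the add_sub_i0 name comes from the failing log line; model analyzer writes such generated config variants into the output repository):

    # After a successful run, the output repository should contain the
    # generated config variants that the profiler loads into Triton.
    ls /data0/fumi/model_analyzer_output/model_output
    # expect directories such as add_sub_i0, one per profiled configuration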

0 reactions
fumihwh commented, Apr 22, 2021

@Tabrizian Thanks, it works. I noticed the model_output difference and understand now.
