
Model Analyzer on a remote host and in Docker

See original GitHub issue

Can you please help me with this issue?

I have generated models that work with Triton 20.09 in a standalone Triton Inference Server container. I have built Model Analyzer, which by default supports Triton 20.11. When I pass the models and plugins generated with 20.09, Model Analyzer gives an error while loading them, since it targets 20.11. On the other hand, when I generate the models and plugins with the TensorRT NGC container 20.11 and load them in the 20.11 Model Analyzer, it runs without any issue. My requirement is to load the models and plugins generated for 20.09 in Model Analyzer.

Running the Model Analyzer container

sudo docker run --gpus 1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -ti -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged -v /home/ubuntu/cuda/sec_models:/models -v /home/ubuntu/cuda/plugins/:/plugins --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" triton_modelanalyzer bash

The models and plugins passed in the above command were generated for 20.09-py3. They load fine with the 20.09-py3 Triton Inference Server.

Command inside the container

model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode docker --triton-version 20.09-py3

Error

model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode docker --triton-version 20.09-py3
2021-01-23 19:39:10.854 INFO[entrypoint.py:368] Triton Model Analyzer started: config={'model_repository': '/models/', 'model_names': 'yolo1', 'batch_sizes': '1', 'concurrency': '1', 'export': None, 'export_path': '.', 'filename_model_inference': 'metrics-model-inference.csv', 'filename_model_gpu': 'metrics-model-gpu.csv', 'filename_server_only': 'metrics-server-only.csv', 'max_retries': 100, 'duration_seconds': 5.0, 'monitoring_interval': 0.01, 'client_protocol': 'grpc', 'perf_analyzer_path': 'perf_analyzer', 'perf_measurement_window': 5000, 'no_perf_output': None, 'triton_launch_mode': 'docker', 'triton_version': '20.09-py3', 'log_level': 'INFO', 'triton_http_endpoint': 'localhost:8000', 'triton_grpc_endpoint': 'localhost:8001', 'triton_metrics_url': 'http://localhost:8002/metrics', 'triton_server_path': 'tritonserver', 'triton_output_path': None, 'gpus': ['all'], 'config_file': None}
2021-01-23 19:39:10.859 INFO[entrypoint.py:105] Starting a Triton Server using docker...
2021-01-23 19:39:10.859 INFO[driver.py:236] init
2021-01-23 19:39:13.687 INFO[entrypoint.py:209] Triton Server is ready.
2021-01-23 19:39:14.714 INFO[entrypoint.py:383] Starting perf_analyzer...
2021-01-23 19:39:14.714 INFO[analyzer.py:91] Profiling server only metrics...
2021-01-23 19:39:15.737 INFO[monitor.py:74] Using GPU(s) with UUID(s) = { GPU-5df6aea1-a690-25ee-c16e-bd46a1d95792 } for the analysis.
2021-01-23 19:39:21.852 ERROR[entrypoint.py:387] Model Analyzer encountered an error: Unable to load the model : [StatusCode.INTERNAL] failed to load 'yolo1', no version is available
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 79, in load_model
    self._client.load_model(model.name())
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 555, in load_model
    raise_error_grpc(rpc_error)
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 61, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] failed to load 'yolo1', no version is available

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 384, in main
    run_analyzer(config, analyzer, client, run_configs)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 323, in run_analyzer
    client.load_model(model=model)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 82, in load_model
    f"Unable to load the model : {e}")
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: Unable to load the model : [StatusCode.INTERNAL] failed to load 'yolo1', no version is available
2021-01-23 19:39:21.854 INFO[server_docker.py:128] Stopping triton server.

Also, how do we run Model Analyzer in remote mode against a Docker-hosted Triton server?

Standalone inference server (20.09-py3)

sudo docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged -v/home/ubuntu/cuda/sec_models:/models -v/home/ubuntu/cuda/plugins/:/plugins --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" bdb0cbe1c039 tritonserver --model-repository=/models --grpc-infer-allocation-pool-size=512 --log-verbose 1

Output

I0123 19:44:29.564053 1 grpc_server.cc:2078] Thread started for ModelStreamInferHandler
I0123 19:44:29.564070 1 grpc_server.cc:3897] Started GRPCInferenceService at 0.0.0.0:8001
I0123 19:44:29.564351 1 http_server.cc:2705] Started HTTPService at 0.0.0.0:8000
I0123 19:44:29.605837 1 http_server.cc:2724] Started Metrics Service at 0.0.0.0:8002

Model Analyzer command

sudo docker run --gpus 1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -ti -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged -v /home/ubuntu/cuda/sec_models:/models -v /home/ubuntu/cuda/plugins/:/plugins --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" triton_modelanalyzer bash

Inside the container

model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode remote --triton-grpc-endpoint localhost:8001
2021-01-23 19:53:10.191 INFO[entrypoint.py:368] Triton Model Analyzer started: config={'model_repository': '/models/', 'model_names': 'yolo1', 'batch_sizes': '1', 'concurrency': '1', 'export': None, 'export_path': '.', 'filename_model_inference': 'metrics-model-inference.csv', 'filename_model_gpu': 'metrics-model-gpu.csv', 'filename_server_only': 'metrics-server-only.csv', 'max_retries': 100, 'duration_seconds': 5.0, 'monitoring_interval': 0.01, 'client_protocol': 'grpc', 'perf_analyzer_path': 'perf_analyzer', 'perf_measurement_window': 5000, 'no_perf_output': None, 'triton_launch_mode': 'remote', 'triton_version': '20.11-py3', 'log_level': 'INFO', 'triton_http_endpoint': 'localhost:8000', 'triton_grpc_endpoint': 'localhost:8001', 'triton_metrics_url': 'http://localhost:8002/metrics', 'triton_server_path': 'tritonserver', 'triton_output_path': None, 'gpus': ['all'], 'config_file': None}
2021-01-23 19:53:10.197 INFO[entrypoint.py:84] Using remote Triton Server...
2021-01-23 19:53:10.199 INFO[entrypoint.py:209] Triton Server is ready.
2021-01-23 19:53:10.199 INFO[driver.py:236] init
2021-01-23 19:53:11.299 INFO[entrypoint.py:383] Starting perf_analyzer...
2021-01-23 19:53:11.299 INFO[analyzer.py:91] Profiling server only metrics...
2021-01-23 19:53:12.323 INFO[monitor.py:74] Using GPU(s) with UUID(s) = { GPU-5df6aea1-a690-25ee-c16e-bd46a1d95792 } for the analysis.
2021-01-23 19:53:18.438 ERROR[entrypoint.py:387] Model Analyzer encountered an error: Unable to load the model : [StatusCode.UNAVAILABLE] explicit model load / unload is not allowed if polling is enabled
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 79, in load_model
    self._client.load_model(model.name())
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 555, in load_model
    raise_error_grpc(rpc_error)
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 61, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] explicit model load / unload is not allowed if polling is enabled

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 384, in main
    run_analyzer(config, analyzer, client, run_configs)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 323, in run_analyzer
    client.load_model(model=model)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 82, in load_model
    f"Unable to load the model : {e}")
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: Unable to load the model : [StatusCode.UNAVAILABLE] explicit model load / unload is not allowed if polling is enabled
root@tensorgo-rppg:/opt/triton-model-analyzer# 


Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

2 reactions
Tabrizian commented, Jan 23, 2021

@alphapibeta Regarding your first question, there is currently a bug in Model Analyzer that requires the model repository path inside the container to be the same as the path outside the container.

For now, I recommend mounting the model repository at the same path as on the host machine:

sudo docker run --gpus 1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -ti -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged -v /home/ubuntu/cuda/sec_models:/home/ubuntu/cuda/sec_models -v /home/ubuntu/cuda/plugins/:/plugins --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" triton_modelanalyzer bash
model-analyzer -m /home/ubuntu/cuda/sec_models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode docker --triton-version 20.09-py3

/cc @xprotobeast2

Regarding your second question, you need to start tritonserver with the --model-control-mode=explicit flag when you want to use remote mode. I’ll update the docs to reflect this. Thanks for pointing this out.
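For reference, a minimal sketch of what the adjusted remote-mode setup could look like, based on the commands earlier in this thread (the NGC image tag, repository paths, and plugin names are taken from above and are assumptions about the exact setup):

# Start the standalone 20.09 server with explicit model control,
# so Model Analyzer is allowed to load/unload models over gRPC.
sudo docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --net host \
  -v /home/ubuntu/cuda/sec_models:/models -v /home/ubuntu/cuda/plugins/:/plugins \
  --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" \
  nvcr.io/nvidia/tritonserver:20.09-py3 \
  tritonserver --model-repository=/models --model-control-mode=explicit --grpc-infer-allocation-pool-size=512

# Then, inside the Model Analyzer container, point it at the already-running server.
model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode remote --triton-grpc-endpoint localhost:8001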

0 reactions
naveengogineni commented, Jan 27, 2021

@xprotobeast2 Actually, I tried to load a model that does not exist after restarting the Triton Server with --model-control-mode=explicit, and then encountered this error (Number of retries exceeded) on the second run. However, these flags (--model-control-mode=explicit, --triton-launch-mode remote) worked when the model being loaded exists in the repository.
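If it helps to narrow down which case you are hitting, one way is to query the server’s model-control endpoints directly before starting Model Analyzer. A minimal sketch, assuming the standalone server from this thread is running with --model-control-mode=explicit and its HTTP endpoint is on localhost:8000 (yolo1 is the model name used above):

# List the models the server sees in its repository.
curl -s -X POST localhost:8000/v2/repository/index

# Explicitly load the model, then check that it became ready;
# if this load fails, Model Analyzer's load_model call fails the same way.
curl -s -X POST localhost:8000/v2/repository/models/yolo1/load
curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/models/yolo1/ready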
