Model Analyzer on a remote host and in Docker
Can you please help me with this issue?
I have generated models that work with Triton 20.09 in a standalone Triton Inference Server container. I have built Model Analyzer, which by default supports Triton Inference Server 20.11. When I pass models and plugins generated with 20.09, Model Analyzer gives an error while loading them, since it targets 20.11. On the other hand, when I generate the models and plugins with the TensorRT NGC container 20.11 and load them into the 20.11 Model Analyzer, it runs without any issue. My requirement is to load the models and plugins generated for 20.09 into Model Analyzer.
Running the Model Analyzer container
sudo docker run --gpus 1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -ti -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged -v /home/ubuntu/cuda/sec_models:/models -v /home/ubuntu/cuda/plugins/:/plugins --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" triton_modelanalyzer bash
The models and plugins given in the above command are generated for 20.09-py3. They load fine with the 20.09-py3 Triton Inference Server.
Command inside the Docker container:
model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode docker --triton-version 20.09-py3
Error
model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode docker --triton-version 20.09-py3
2021-01-23 19:39:10.854 INFO[entrypoint.py:368] Triton Model Analyzer started: config={'model_repository': '/models/', 'model_names': 'yolo1', 'batch_sizes': '1', 'concurrency': '1', 'export': None, 'export_path': '.', 'filename_model_inference': 'metrics-model-inference.csv', 'filename_model_gpu': 'metrics-model-gpu.csv', 'filename_server_only': 'metrics-server-only.csv', 'max_retries': 100, 'duration_seconds': 5.0, 'monitoring_interval': 0.01, 'client_protocol': 'grpc', 'perf_analyzer_path': 'perf_analyzer', 'perf_measurement_window': 5000, 'no_perf_output': None, 'triton_launch_mode': 'docker', 'triton_version': '20.09-py3', 'log_level': 'INFO', 'triton_http_endpoint': 'localhost:8000', 'triton_grpc_endpoint': 'localhost:8001', 'triton_metrics_url': 'http://localhost:8002/metrics', 'triton_server_path': 'tritonserver', 'triton_output_path': None, 'gpus': ['all'], 'config_file': None}
2021-01-23 19:39:10.859 INFO[entrypoint.py:105] Starting a Triton Server using docker...
2021-01-23 19:39:10.859 INFO[driver.py:236] init
2021-01-23 19:39:13.687 INFO[entrypoint.py:209] Triton Server is ready.
2021-01-23 19:39:14.714 INFO[entrypoint.py:383] Starting perf_analyzer...
2021-01-23 19:39:14.714 INFO[analyzer.py:91] Profiling server only metrics...
2021-01-23 19:39:15.737 INFO[monitor.py:74] Using GPU(s) with UUID(s) = { GPU-5df6aea1-a690-25ee-c16e-bd46a1d95792 } for the analysis.
2021-01-23 19:39:21.852 ERROR[entrypoint.py:387] Model Analyzer encountered an error: Unable to load the model : [StatusCode.INTERNAL] failed to load 'yolo1', no version is available
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 79, in load_model
    self._client.load_model(model.name())
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 555, in load_model
    raise_error_grpc(rpc_error)
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 61, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] failed to load 'yolo1', no version is available

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 384, in main
    run_analyzer(config, analyzer, client, run_configs)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 323, in run_analyzer
    client.load_model(model=model)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 82, in load_model
    f"Unable to load the model : {e}")
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: Unable to load the model : [StatusCode.INTERNAL] failed to load 'yolo1', no version is available
2021-01-23 19:39:21.854 INFO[server_docker.py:128] Stopping triton server.
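(Side note: the "no version is available" error usually means Triton did not find a loadable numeric version subdirectory for the model. A typical TensorRT model repository layout is sketched below; the model.plan file name is illustrative. If the repository already follows this layout on the host, the failure in docker launch mode is more likely the container-path issue discussed in the comments further down.)

/models/
  yolo1/
    config.pbtxt
    1/
      model.plan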
Also, how do we run Model Analyzer in remote mode with Docker?
Standalone inference server (20.09-py3)
sudo docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged -v/home/ubuntu/cuda/sec_models:/models -v/home/ubuntu/cuda/plugins/:/plugins --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" bdb0cbe1c039 tritonserver --model-repository=/models --grpc-infer-allocation-pool-size=512 --log-verbose 1
Output
I0123 19:44:29.564053 1 grpc_server.cc:2078] Thread started for ModelStreamInferHandler
I0123 19:44:29.564070 1 grpc_server.cc:3897] Started GRPCInferenceService at 0.0.0.0:8001
I0123 19:44:29.564351 1 http_server.cc:2705] Started HTTPService at 0.0.0.0:8000
I0123 19:44:29.605837 1 http_server.cc:2724] Started Metrics Service at 0.0.0.0:8002
Model Analyzer command
sudo docker run --gpus 1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -ti -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged -v /home/ubuntu/cuda/sec_models:/models -v /home/ubuntu/cuda/plugins/:/plugins --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" triton_modelanalyzer bash
Inside the Docker container:
model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 --triton-launch-mode remote --triton-grpc-endpoint localhost:8001
2021-01-23 19:53:10.191 INFO[entrypoint.py:368] Triton Model Analyzer started: config={'model_repository': '/models/', 'model_names': 'yolo1', 'batch_sizes': '1', 'concurrency': '1', 'export': None, 'export_path': '.', 'filename_model_inference': 'metrics-model-inference.csv', 'filename_model_gpu': 'metrics-model-gpu.csv', 'filename_server_only': 'metrics-server-only.csv', 'max_retries': 100, 'duration_seconds': 5.0, 'monitoring_interval': 0.01, 'client_protocol': 'grpc', 'perf_analyzer_path': 'perf_analyzer', 'perf_measurement_window': 5000, 'no_perf_output': None, 'triton_launch_mode': 'remote', 'triton_version': '20.11-py3', 'log_level': 'INFO', 'triton_http_endpoint': 'localhost:8000', 'triton_grpc_endpoint': 'localhost:8001', 'triton_metrics_url': 'http://localhost:8002/metrics', 'triton_server_path': 'tritonserver', 'triton_output_path': None, 'gpus': ['all'], 'config_file': None}
2021-01-23 19:53:10.197 INFO[entrypoint.py:84] Using remote Triton Server...
2021-01-23 19:53:10.199 INFO[entrypoint.py:209] Triton Server is ready.
2021-01-23 19:53:10.199 INFO[driver.py:236] init
2021-01-23 19:53:11.299 INFO[entrypoint.py:383] Starting perf_analyzer...
2021-01-23 19:53:11.299 INFO[analyzer.py:91] Profiling server only metrics...
2021-01-23 19:53:12.323 INFO[monitor.py:74] Using GPU(s) with UUID(s) = { GPU-5df6aea1-a690-25ee-c16e-bd46a1d95792 } for the analysis.
2021-01-23 19:53:18.438 ERROR[entrypoint.py:387] Model Analyzer encountered an error: Unable to load the model : [StatusCode.UNAVAILABLE] explicit model load / unload is not allowed if polling is enabled
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 79, in load_model
    self._client.load_model(model.name())
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 555, in load_model
    raise_error_grpc(rpc_error)
  File "/usr/local/lib/python3.6/dist-packages/tritonclient/grpc/__init__.py", line 61, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] explicit model load / unload is not allowed if polling is enabled

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 384, in main
    run_analyzer(config, analyzer, client, run_configs)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/entrypoint.py", line 323, in run_analyzer
    client.load_model(model=model)
  File "/usr/local/lib/python3.6/dist-packages/model_analyzer/triton/client/client.py", line 82, in load_model
    f"Unable to load the model : {e}")
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: Unable to load the model : [StatusCode.UNAVAILABLE] explicit model load / unload is not allowed if polling is enabled
root@tensorgo-rppg:/opt/triton-model-analyzer#
Top GitHub Comments
@alphapibeta Regarding your first question, there is a bug in Model Analyzer currently that requires the path inside the container to be the same as the path outside the container.
For now, I recommend mounting the model repository at the same path inside the container as on the host machine.
/cc @xprotobeast2
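For illustration, a minimal sketch of that workaround based on the commands above, assuming the repository lives at /home/ubuntu/cuda/sec_models and the plugins at /home/ubuntu/cuda/plugins/ on the host; the identical-path mounts are an assumption drawn from the comment above, not a verified fix:

# Mount the model repository (and plugins) at the same path inside the container as on the host.
sudo docker run --gpus 1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -ti \
  -v /var/run/docker.sock:/var/run/docker.sock --net host --privileged \
  -v /home/ubuntu/cuda/sec_models:/home/ubuntu/cuda/sec_models \
  -v /home/ubuntu/cuda/plugins/:/home/ubuntu/cuda/plugins/ \
  --env LD_PRELOAD="/home/ubuntu/cuda/plugins/libyolo_layer.so:/home/ubuntu/cuda/plugins/libdecodeplugin.so" \
  triton_modelanalyzer bash
# Inside the container, point Model Analyzer at the host-identical path:
model-analyzer -m /home/ubuntu/cuda/sec_models -n yolo1 --batch-size 1 -c 1 \
  --triton-launch-mode docker --triton-version 20.09-py3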
Regarding your second question, you need to start the tritonserver using
--model-control-mode=explicit
flag when you want to use the remote mode. I’ll update the doc to reflect this. Thanks for pointing this out.

@xprotobeast2 Actually, I tried to load a model that does not exist after restarting the Triton Server with --model-control-mode=explicit, and then ran into this error (Number of retries exceeded) on the second run. However, these flags (--model-control-mode=explicit, --triton-launch-mode remote) worked when the loaded model exists in the repo.
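For completeness, a sketch of the remote-mode workflow with explicit model control, adapted from the commands in the issue; nvcr.io/nvidia/tritonserver:20.09-py3 is assumed here in place of the image hash bdb0cbe1c039 used above, and the combination has not been verified on this setup:

# Start the standalone Triton server with explicit model control so Model Analyzer
# can load/unload models over gRPC in remote mode.
sudo docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  -p8000:8000 -p8001:8001 -p8002:8002 --net host --privileged \
  -v /home/ubuntu/cuda/sec_models:/models -v /home/ubuntu/cuda/plugins/:/plugins \
  --env LD_PRELOAD="/plugins/libyolo_layer.so:/plugins/libdecodeplugin.so" \
  nvcr.io/nvidia/tritonserver:20.09-py3 tritonserver --model-repository=/models \
  --model-control-mode=explicit --grpc-infer-allocation-pool-size=512 --log-verbose 1
# Then, from the Model Analyzer container, point at the running server:
model-analyzer -m /models/ -n yolo1 --batch-size 1 -c 1 \
  --triton-launch-mode remote --triton-grpc-endpoint localhost:8001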