
Not able to provide <perf-analyzer-flags> 'shape' for perf_analyzer in config.yaml, results in termination


Hi,

Following the perf-analyzer-flags documentation, I have added two flags for perf_analyzer, as shown below, since I have a model with dynamic input shapes. --shape images:3,640,640 works when used with perf_analyzer directly.

But adding to my config.yaml:

perf_analyzer_flags:
    percentile: 95
    shape: 'images:3,640,640'

or:

perf_analyzer_flags:
    percentile: 95
    shape: images:3,640,640

Results in a termination: ERROR[perf_analyzer.py:164] perf_analyzer was terminated by signal: SIGABRT

When I remove the shape flag this error goes away, but I still need to provide the input shape, because otherwise perf_analyzer fails with: failed with exit status 1 : error: failed to create concurrency manager: input images contain a dynamic shape, provide shapes to send along with the request

So I am not sure how to pass images:3,640,640 to the shape flag in config.yaml.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
Michelvl92 commented, Oct 13, 2021

@aramesh7 I think this log provides a lot of information about what goes wrong. It looks like --shape is parsed incorrectly: each character of the value is passed as its own --shape argument, as you can see in the log below:

2021-10-13 08:58:37.52 DEBUG[perf_analyzer.py:128] Running perf_analyzer with args: -m yolov5x_onnx_dynamic -b 1 -u localhost:8000 -i http --concurrency-range=2 --percentile=95 --shape=i --shape=m --shape=a --shape=g --shape=e --shape=s --shape=: --shape=3 --shape=, --shape=6 --shape=4 --shape=0 --shape=, --shape=6 --shape=4 --shape=0 --measurement-mode=count_windows
2021-10-13 08:58:38.68 ERROR[perf_analyzer.py:164] perf_analyzer was terminated by signal: SIGABRT

This should be a single argument: --shape=images:3,640,640
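The per-character flags are consistent with a flag builder that iterates over its value: a Python string iterates character by character, while a list yields whole elements. A minimal sketch of that failure mode (illustrative only, not Model Analyzer's actual code; `build_repeated_flags` is a hypothetical name):

```python
def build_repeated_flags(name, value):
    # Emit one --<name>=<element> per element of `value`.
    # If `value` is a string, iteration yields single characters,
    # reproducing the garbled --shape=i --shape=m ... seen in the log.
    return [f"--{name}={v}" for v in value]

print(build_repeated_flags("shape", "images"))
# ['--shape=i', '--shape=m', '--shape=a', '--shape=g', '--shape=e', '--shape=s']

print(build_repeated_flags("shape", ["images:3,640,640"]))
# ['--shape=images:3,640,640']
```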

It works perfectly when using perf_analyzer directly with the following arguments:

perf_analyzer -m yolov5x_onnx_dynamic -b 1 -u localhost:8000 -i http --concurrency-range=2 --percentile=95 --shape images:3,640,640

Steps to reproduce:

  1. Start a triton server: docker run --gpus=all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /model_repository:/models nvcr.io/nvidia/tritonserver:21.09-py3 tritonserver --model-repository=/models --model-control-mode=explicit

  2. Start client docker: docker run -it --gpus all --net=host nvcr.io/nvidia/tritonserver:21.09-py3-sdk

  3. Use the following config file:

model_repository: '/models'
profile_models: 'yolov5x_onnx_dynamic'
override_output_model_repository: true

triton_launch_mode: 'remote'
client_protocol: 'http'
triton_http_endpoint: 'localhost:8000'

concurrency:
    start: 1
    stop: 1
    step: 1

batch_sizes:
    start: 1
    stop: 1
    step: 1

perf_analyzer_flags:
    percentile: 95
    shape: 'images:3,640,640'
  4. Run Model Analyzer: model-analyzer -v profile -f config.yaml
  5. Full log output:
root@v2119:/workspace# model-analyzer -v profile -f /models/test.yaml
2021-10-13 08:58:27.620 DEBUG[entrypoint.py:271] {'config_file': '/models/test.yaml', 'checkpoint_directory': './checkpoints', 'monitoring_interval': 1.0, 'duration_seconds': 5, 'collect_cpu_metrics': False, 'gpus': ['all'], 'model_repository': '/models', 'output_model_repository_path': './output_model_repository', 'override_output_model_repository': True, 'client_max_retries': 50, 'client_protocol': 'http', 'perf_analyzer_flags': {'percentile': '95', 'shape': 'images:3,640,640'}, 'triton_server_flags': {}, 'triton_server_environment': {}, 'objectives': {'perf_throughput': 10}, 'constraints': {}, 'profile_models': [{'model_name': 'yolov5x_onnx_dynamic', 'cpu_only': False, 'objectives': {'perf_throughput': 10}, 'parameters': {'batch_sizes': [1], 'concurrency': [1]}, 'perf_analyzer_flags': {'percentile': '95', 'shape': 'images:3,640,640'}}], 'batch_sizes': [1], 'concurrency': [1], 'perf_analyzer_timeout': 600, 'perf_analyzer_cpu_util': 1600.0, 'perf_analyzer_path': 'perf_analyzer', 'perf_output': False, 'perf_output_path': None, 'perf_analyzer_max_auto_adjusts': 10, 'triton_launch_mode': 'remote', 'triton_docker_image': 'nvcr.io/nvidia/tritonserver:21.09-py3', 'triton_http_endpoint': 'localhost:8000', 'triton_grpc_endpoint': 'localhost:8001', 'triton_metrics_url': 'http://localhost:8002/metrics', 'triton_server_path': 'tritonserver', 'triton_output_path': None, 'triton_docker_mounts': [], 'triton_docker_labels': {}, 'triton_install_path': '/opt/tritonserver', 'run_config_search_max_concurrency': 1024, 'run_config_search_max_instance_count': 5, 'run_config_search_disable': False, 'run_config_search_max_preferred_batch_size': 16, 'run_config_search_preferred_batch_size_disable': False}
2021-10-13 08:58:27.620 INFO[gpu_device_factory.py:50] Initiliazing GPUDevice handles...
2021-10-13 08:58:28.728 INFO[gpu_device_factory.py:246] Using GPU 0 NVIDIA GeForce RTX 3090 with UUID GPU-cbaea145-6239-1810-c34e-f0322970ffe8
2021-10-13 08:58:28.729 WARNING[entrypoint.py:253] Overriding the output model repo path "./output_model_repository"...
2021-10-13 08:58:28.730 INFO[entrypoint.py:78] Using remote Triton Server...
2021-10-13 08:58:28.730 WARNING[entrypoint.py:83] GPU memory metrics reported in the remote mode are not accuracte. Model Analyzer uses Triton explicit model control to load/unload models. Some frameworks do not release the GPU memory even when the memory is not being used. Consider using the "local" or "docker" mode if you want to accurately monitor the GPU memory usage for different models.
2021-10-13 08:58:28.730 WARNING[entrypoint.py:90] Config sweep parameters are ignored in the "remote" mode because Model Analyzer does not have access to the model repository of the remote Triton Server.
2021-10-13 08:58:28.731 INFO[analyzer_state_manager.py:120] Loaded checkpoint from file ./checkpoints/0.ckpt
2021-10-13 08:58:28.809 INFO[analyzer.py:104] Profiling server only metrics...
2021-10-13 08:58:30.835 INFO[model_manager.py:91] Running auto config search for model: yolov5x_onnx_dynamic
2021-10-13 08:58:30.835 INFO[run_search.py:151] Will sweep only through the concurrency values...
2021-10-13 08:58:30.835 INFO[run_search.py:290] [Search Step] Concurrency set to 2.
2021-10-13 08:58:30.884 INFO[client.py:83] Model yolov5x_onnx_dynamic loaded.
2021-10-13 08:58:30.886 INFO[client.py:107] Model yolov5x_onnx_dynamic unloaded.
2021-10-13 08:58:37.36 INFO[client.py:83] Model yolov5x_onnx_dynamic loaded.
2021-10-13 08:58:37.37 INFO[model_manager.py:221] Profiling model yolov5x_onnx_dynamic...
2021-10-13 08:58:37.52 DEBUG[perf_analyzer.py:128] Running perf_analyzer with args: -m yolov5x_onnx_dynamic -b 1 -u localhost:8000 -i http --concurrency-range=2 --percentile=95 --shape=i --shape=m --shape=a --shape=g --shape=e --shape=s --shape=: --shape=3 --shape=, --shape=6 --shape=4 --shape=0 --shape=, --shape=6 --shape=4 --shape=0 --measurement-mode=count_windows
2021-10-13 08:58:38.68 ERROR[perf_analyzer.py:164] perf_analyzer was terminated by signal: SIGABRT
2021-10-13 08:58:39.56 INFO[analyzer_state_manager.py:158] Saved checkpoint to ./checkpoints/1.ckpt.
2021-10-13 08:58:39.56 INFO[analyzer.py:122] Finished profiling. Obtained measurements for models: [].
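Given the per-character --shape arguments in the log above, one workaround to try (my assumption, not confirmed by the maintainers in this thread) is to write the shape as a YAML list, so that iterating the value yields the whole string as a single element:

```yaml
perf_analyzer_flags:
    percentile: 95
    shape:
        - 'images:3,640,640'
```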
0 reactions
matthewkotila commented, Dec 10, 2021
