Unable to run optimized BERT: model shape expected by framework [-1,-1] doesn't match model configuration shape [-1,-1]
Description
When deploying the optimized model, I get this error:
INVALID_ARGUMENT: getPluginCreator could not find plugin CustomEmbLayerNormPluginDynamic version 1
safeDeserializationUtils.cpp (259) - Serialization Error in load: 0 (Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
I1024 17:29:43.767217 538 model_config_utils.cc:203] autofilled config: name: "bertoptimized_plan"
platform: "tensorrt_plan"
max_batch_size: 1
input {
  name: "input_ids"
  data_type: TYPE_INT32
  dims: -1
  dims: -1
}
input {
  name: "segment_ids"
  data_type: TYPE_INT32
  dims: -1
  dims: -1
}
input {
  name: "input_mask"
  data_type: TYPE_INT32
  dims: -1
  dims: -1
}
output {
  name: "cls_squad_logits"
  data_type: TYPE_FP32
  dims: -1
  dims: -1
  dims: 2
  dims: 1
  dims: 1
}
default_model_filename: "model.plan"
** TRTIS Information **
What version of TRTIS are you using? nvcr.io/nvidia/tensorrtserver:19.09-py3
Are you using the TRTIS container or did you build it yourself? Used the container; did not build.
To Reproduce
Steps to reproduce the behavior:
1. Follow https://github.com/NVIDIA/TensorRT/tree/master/demo/BERT#example-workflow using the container nvcr.io/nvidia/tensorflow:19.09-py3
2. Save the model like: build/sample_bert -d $BERT_PATH -d $OUTPUT_PATH --fp16 --nheads 12 --saveEngine model.plan
3. Add the plan to the model repository on TRTIS
Expected behavior
The optimized model should load and accept inference requests.
Top GitHub Comments
The output of the build command is libbert_plugins.so and libcommon.so, in TensorRT/demo/BERT/build. I run the server with:
LD_PRELOAD=/opt/tensorrtserver/libbert_plugins.so:/opt/tensorrtserver/libcommon.so trtserver --model-repository=/models --strict-model-config=false
I just copied those files from TensorRT/demo/BERT/build into /opt/tensorrtserver.
This is a known issue. The tensorrtserver 19.09 release does not support the autofiller for dynamic-shaped inputs. TensorRT 6 dynamic-shaped plans need an explicit batch dimension in their bindings, whereas TRTIS does not need an explicit batch dimension in config.pbtxt, because it is added implicitly.
So, provide your own config.pbtxt as:
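A minimal sketch, assuming the leading -1 of each binding is the batch dimension that TRTIS adds implicitly, so it is dropped from dims and covered by max_batch_size:

name: "bertoptimized_plan"
platform: "tensorrt_plan"
max_batch_size: 1
input {
  name: "input_ids"
  data_type: TYPE_INT32
  dims: -1
}
input {
  name: "segment_ids"
  data_type: TYPE_INT32
  dims: -1
}
input {
  name: "input_mask"
  data_type: TYPE_INT32
  dims: -1
}
output {
  name: "cls_squad_logits"
  data_type: TYPE_FP32
  dims: -1
  dims: 2
  dims: 1
  dims: 1
}
default_model_filename: "model.plan"

Each input is then [-1] plus the implicit batch dimension, which matches the [-1, -1] binding shape of the plan.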
And run the server as:
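Presumably the same invocation as in the comment above, still preloading the plugin libraries so the CustomEmbLayerNormPluginDynamic creator is registered; with an explicit config.pbtxt, the --strict-model-config=false flag is no longer needed:

LD_PRELOAD=/opt/tensorrtserver/libbert_plugins.so:/opt/tensorrtserver/libcommon.so trtserver --model-repository=/models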
You can also wait for the 19.10 release, which has autofiller support for dynamic-shaped plans, or build your own server from the r19.10 branch.