getting incorrect output shape
Description
First of all, I’m not sure if this is a bug or a misunderstanding, but I haven’t been able to find a solution anywhere else.
I have set up Triton Inference Server using Docker and everything was smooth. I’m serving a custom version of mobilenet_ssd that my team developed in TensorFlow. The issue I’m facing is that, using the Python client, I’m getting a response that doesn’t match the model configuration given to the server.
For this particular model, the configuration file looks something like this:
name: "my_model"
platform: "tensorflow_savedmodel"
max_batch_size: 0
input [
{
name: "input"
data_type: TYPE_FP32
dims: [ 300, 300, 3 ]
format: FORMAT_NHWC
}
]
output [
{
name: "predictions"
data_type: TYPE_FP32
dims: [ 1917, 17 ]
label_filename: "labels.txt"
}
]
So I’m expecting an output shaped [1917, 17]. However, I’m getting a bytes array shaped [1, 3]. I tried to interpret the buffer with numpy.frombuffer using numpy.float32, but I don’t get anything close to what I’m expecting.
The model itself shouldn’t be the problem because it’s been working in prod for some time and it’s been heavily tested.
For example, after getting the responses from the triton_client.infer() calls and converting them with as_numpy(), I end up with something like:
[[b'6.252468:9676' b'6.248357:8707' b'6.226352:10645']]
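For reference, here is a minimal sketch of the request/response pattern described above, using the Python HTTP client. The server URL, the input preprocessing, and the variable names are assumptions, since the actual client code isn’t shown in the issue:

import numpy as np
import tritonclient.http as httpclient

# Assumed server address; the issue does not show the actual client setup.
client = httpclient.InferenceServerClient(url="localhost:8000")

# max_batch_size is 0, so the request shape matches the config exactly: [300, 300, 3].
image = np.random.rand(300, 300, 3).astype(np.float32)  # placeholder input

infer_input = httpclient.InferInput("input", list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

# No class_count here, so the raw model output is requested.
requested_output = httpclient.InferRequestedOutput("predictions")

response = client.infer(
    model_name="my_model",
    inputs=[infer_input],
    outputs=[requested_output],
)

# With the configuration above, this would be expected to be a float32
# array of shape (1917, 17).
predictions = response.as_numpy("predictions")
print(predictions.shape, predictions.dtype)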
As I said, I’m pretty sure this is a conceptual misunderstanding rather than an actual bug.
Triton Information
What version of Triton are you using? 2.26.0
Are you using the Triton container or did you build it yourself? I’m using the Docker container.
To Reproduce
Steps to reproduce the behaviour: this can’t be reproduced since it is a proprietary model, but as I said, I think I just need some context on how the server itself works.
Expected behavior
I expect the model responses to be shaped as I specified in the configuration file.
Top GitHub Comments
The output looks like you have requested the classification extension rather than directly returning the model output. Can you check your client code to see if this is set for the request?
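For illustration, the classification extension is enabled from the Python client by passing class_count when building the requested output; a short sketch reusing the hypothetical names from the snippet above:

# Passing class_count asks Triton to apply the classification extension:
# instead of the raw tensor, the server returns the top-k entries as BYTES
# strings of the form "value:index" (or "value:index:label" when labels
# are available), which matches the b'6.252468:9676' strings above.
requested_output = httpclient.InferRequestedOutput("predictions", class_count=3)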
@nv-kmcgill53 @tanayvarshney for auto-complete question
I have managed to find the issue with the response. I was passing the number of classes to the output object, so I guess that’s what triggered the classification. Thank you so much!
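For completeness, a sketch of the corrected request under the same assumptions: leaving class_count unset (it defaults to 0) disables the classification extension, so as_numpy("predictions") returns the raw tensor again.

# Without class_count, the raw model output is returned, so the response
# should be a float32 array of shape (1917, 17) as defined in the config.
requested_output = httpclient.InferRequestedOutput("predictions")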