
tensorflow_text support for Triton

See original GitHub issue

Description: Triton 21.10 does not support the RegexSplitWithOffsets op. Similar to https://github.com/tensorflow/text/issues/200 or https://github.com/tensorflow/serving/issues/1490.

Triton Information: What version of Triton are you using? I am using Triton version 21.10-py3.

Are you using the Triton container or did you build it yourself? I am using the Triton container.

To Reproduce: I trained my model using TensorFlow 2.5.0 + Google’s BERT. Basically, it is a text classification model with BertTokenizer + BertModel. Its input is a piece of text (string); its output is an array of logits for a 31-label multi-label classification.

I saved my model in the SavedModel format and wrote my config.pbtxt file as follows:

name: "aihivebox-intent"
platform: "tensorflow_savedmodel"
max_batch_size : 0
input [
  {
    name: "text_input"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
output [
  {
    name: "mlp"
    data_type: TYPE_FP32
    dims: [-1,31]
  }
]
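
For reference, the SavedModel exposes a single string input. A minimal export sketch in that shape (a toy stand-in classifier rather than my real BERT model; the vocab path, layer sizes, and sequence length are made up for illustration) would look like:

import tensorflow as tf
import tensorflow_text as text  # registers RegexSplitWithOffsets in this process


class IntentClassifier(tf.Module):
    """Toy stand-in for the BertTokenizer + BertModel classifier."""

    def __init__(self, vocab_path="vocab.txt", num_labels=31):
        super().__init__()
        self.tokenizer = text.BertTokenizer(vocab_path, lower_case=True)
        self.embedding = tf.keras.layers.Embedding(30522, 128)
        self.pool = tf.keras.layers.GlobalAveragePooling1D()
        self.mlp = tf.keras.layers.Dense(num_labels)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.string, name="text_input")])
    def serve(self, text_input):
        # Tokenizing inside the graph is what bakes the custom op into the SavedModel.
        ids = self.tokenizer.tokenize(text_input).merge_dims(-2, -1)
        ids = ids.to_tensor(default_value=0, shape=[None, 128])
        return {"mlp": self.mlp(self.pool(self.embedding(ids)))}


model = IntentClassifier()
tf.saved_model.save(model, "aihivebox-intent/1/model.savedmodel",
                    signatures={"serving_default": model.serve})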

The Triton server starts normally, but when I do inference, it gives me this:

I0311 01:29:19.137185 1 grpc_server.cc:4117] Started GRPCInferenceService at 0.0.0.0:8001
I0311 01:29:19.137930 1 http_server.cc:2815] Started HTTPService at 0.0.0.0:8000
I0311 01:29:19.179941 1 http_server.cc:167] Started Metrics Service at 0.0.0.0:8002
2022-03-11 01:30:41.481080: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:568] function_optimizer failed: Not found: Op type not registered 'RegexSplitWithOffsets' in binary running on bd61380e8e17. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-03-11 01:30:41.629883: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:568] function_optimizer failed: Not found: Op type not registered 'RegexSplitWithOffsets' in binary running on bd61380e8e17. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-03-11 01:30:42.270766: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:568] function_optimizer failed: Not found: Op type not registered 'RegexSplitWithOffsets' in binary running on bd61380e8e17. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-03-11 01:30:42.347079: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:568] function_optimizer failed: Not found: Op type not registered 'RegexSplitWithOffsets' in binary running on bd61380e8e17. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-03-11 01:30:42.445530: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at partitioned_function_ops.cc:113 : Not found: Op type not registered 'RegexSplitWithOffsets' in binary running on bd61380e8e17. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
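
The request that triggers the errors above is just a plain string inference call. A minimal client sketch (using the standard tritonclient HTTP API, assuming the server runs on localhost) looks roughly like this:

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# TYPE_STRING model inputs are sent as BYTES tensors backed by numpy object arrays.
texts = np.array([b"example input sentence"], dtype=np.object_)
inp = httpclient.InferInput("text_input", texts.shape, "BYTES")
inp.set_data_from_numpy(texts)

out = httpclient.InferRequestedOutput("mlp")
result = client.infer(model_name="aihivebox-intent", inputs=[inp], outputs=[out])
print(result.as_numpy("mlp").shape)  # expected (1, 31) once the custom op is available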

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (2 by maintainers)

Top GitHub Comments

1 reaction
SimZhou commented, Jul 7, 2022

Hi @SimZhou I’m facing a similar issue, were you able to resolve this?

Unfortunately, no. But there are two possible alternatives to embedding the custom op into Triton:

  1. Use the Python backend to do the text-to-vector transformation and expose it as an API in Triton; then, before each task, call that API to get the vector first (see the sketch after this list).
  2. Just do the text-to-vector operation yourself, outside Triton.
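
For option 1, a rough model.py sketch for a Python-backend tokenizer model (the vocab path, output name, and fixed sequence length here are assumptions, not code I have deployed) could look like this:

import numpy as np
import tensorflow_text as text
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # vocab.txt is an assumed artifact shipped next to model.py.
        self.tokenizer = text.BertTokenizer("vocab.txt", lower_case=True)
        self.max_len = 128  # assumed fixed sequence length

    def execute(self, requests):
        responses = []
        for request in requests:
            raw = pb_utils.get_input_tensor_by_name(request, "text_input").as_numpy()
            texts = [t.decode("utf-8") for t in raw.reshape(-1)]
            # The tensorflow_text tokenizer runs here, so the TF SavedModel served
            # by Triton no longer needs the RegexSplitWithOffsets op.
            ids = self.tokenizer.tokenize(texts).merge_dims(-2, -1)
            ids = ids.to_tensor(default_value=0, shape=[None, self.max_len]).numpy()
            out = pb_utils.Tensor("token_ids", ids.astype(np.int32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses

The classification SavedModel would then be re-exported to accept token ids instead of raw strings, and the two models can be chained with a Triton ensemble or with two client calls.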

0 reactions
tanmayv25 commented, Aug 20, 2022

Several customers have already deployed tensorflow-text models successfully in Triton with LD_PRELOAD. As described in the linked issue, you have to make sure the version of TensorFlow being used in Triton matches the version of tensorflow-text you are pulling the custom ops from. You can look at my response here to learn more: https://github.com/triton-inference-server/server/issues/3604#issuecomment-982125998

The standard pip install tensorflow-text in Python 3.8 installs TensorFlow libs for 2.x TF versions. When launching tritonserver, you would have to provide --backend-config=tensorflow,version=2 to use a 2.x TF version. For 22.07, the TF versions in the Triton containers are 2.9.1 and 1.15.5, so you should copy _regex_split_ops.so from tensorflow-text==2.9.1 or tensorflow-text==1.15.5 into the Triton container image.
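
One way to find that file inside an installed tensorflow-text wheel (a small locating script, assuming nothing beyond a standard pip install) is to walk the package directory:

import os
import tensorflow_text

# Print the path of the custom-op shared library so it can be copied
# into the Triton container image.
pkg_dir = os.path.dirname(tensorflow_text.__file__)
for root, _, files in os.walk(pkg_dir):
    for name in files:
        if name == "_regex_split_ops.so":
            print(os.path.join(root, name))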

Then you should launch the Triton server as follows. If using TensorFlow v1:

export LD_LIBRARY_PATH=/opt/tritonserver/backends/tensorflow1:$LD_LIBRARY_PATH
LD_PRELOAD=/<path_to_custom_ops_from_tensorflow-text==1.15.5>/_regex_split_ops.so tritonserver --model-store=my_model/ --backend-config=tensorflow,version=1 

If using TensorFlow v2:

export LD_LIBRARY_PATH=/opt/tritonserver/backends/tensorflow2:$LD_LIBRARY_PATH
LD_PRELOAD=/<path_to_custom_ops_from_tensorflow-text==2.9.1>/_regex_split_ops.so tritonserver --model-store=my_model/ --backend-config=tensorflow,version=2
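
Before wiring the library into LD_PRELOAD, a quick sanity check (a sketch; the path below is a placeholder) is to load the .so against the TensorFlow build you plan to use, since a version mismatch surfaces immediately as a load error:

import tensorflow as tf

# tf.load_op_library raises tf.errors.NotFoundError if the shared object is
# missing or was built against an incompatible TensorFlow version.
tf.load_op_library("/opt/ops/_regex_split_ops.so")
print("custom ops loaded successfully")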

The issue in #4212 is:

ERROR: ld.so: object '_regex_split_ops.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

It just means that _regex_split_ops.so was not on the path, so the system was not able to find it. It looks like the entire ops directory was copied to the image; hence,

--env LD_PRELOAD=ops/_regex_split_ops.so \

should have solved the issue. That being said, a lot of tensorflow-text models are supported in Triton via LD_PRELOAD and are in production use by many Triton users.

Closing the issue to avoid future confusion. Please open a new GH issue if you are running into any other problem with the integration.

