Clarity issues when creating Keras raw serving signature with multiple inputs hosted on AI Platform prediction service

Thanks in advance for any guidance on this issue, and I apologize if I am missing something in the docs. Despite attempting related solutions (e.g., #1108, #1885, #1906), I have failed to successfully create two export signature defs: one that would allow raw text predictions via AI Platform's prediction service, and one that would allow the ExampleGen component's example output (example_gen.outputs['examples']) to be used for model evaluation in the Evaluator component.

For reference, I am following the taxi example closely, with the main difference being that I am pulling my own data via BigQuery with BigQueryExampleGen.
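
For context, the BigQuery pull is wired up roughly as in the sketch below; the query is a placeholder, and the import path is the one I believe TFX 0.30 uses for the BigQuery extension.

# Rough sketch (placeholder query; import path assumed from TFX 0.30's extensions module).
from tfx.extensions.google_cloud_big_query.example_gen.component import BigQueryExampleGen

example_gen = BigQueryExampleGen(
    query='SELECT * FROM `my_project.my_dataset.my_table`'  # hypothetical table
)
# Downstream components consume example_gen.outputs['examples'] as tf.Example records.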

dependencies

  1. tfx[kfp]==0.30.0
  2. python 3.7

My latest attempt follows this solution, which integrates a separate export signature for raw data via a MyModule(tf.Module) class. Similar to the author @jason-brian-anderson, we avoided the use of tf.reshape because we use the _fill_in_missing operation in our preprocessing_fn, which expects and parses SparseTensors (the helper itself is reproduced after the code below for reference). Below is the code, embedded within the scope of the run_fn.

class MyModule(tf.Module):
    def __init__(self, model, tf_transform_output):
        self.model = model
        self.tf_transform_output = tf_transform_output
        self.model.tft_layer = self.tf_transform_output.transform_features_layer()

    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')])
    def serve_tf_examples_fn(self, serialized_tf_examples):
        feature_spec = self.tf_transform_output.raw_feature_spec()
        feature_spec.pop(features.LABEL_KEY)
        parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
        transformed_features = self.model.tft_layer(parsed_features)
        return self.model(transformed_features)

    @tf.function(input_signature=[tf.TensorSpec(shape=(None), dtype=tf.string, name='raw_data')])
    def tf_serving_raw_input_fn(self, raw_data):
        raw_data_sp_tensor = tf.sparse.SparseTensor(
            indices=[[0, 0]],
            values=raw_data,
            dense_shape=(1, 1)
        )
        parsed_features = {'raw_data': raw_data_sp_tensor}
        transformed_features = self.model.tft_layer(parsed_features)
        return self.model(transformed_features)


module = MyModule(model, tf_transform_output)

signatures = {
    "serving_default": module.serve_tf_examples_fn,
    "serving_raw_input": module.tf_serving_raw_input_fn,
}
tf.saved_model.save(
    module,
    export_dir=fn_args.serving_model_dir,
    signatures=signatures,
    options=None,
)
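
For reference, the _fill_in_missing helper in our preprocessing_fn is taken from the taxi example and looks roughly like the sketch below (reproduced approximately from the example rather than our exact code), which is why the raw-input signature above builds a SparseTensor by hand.

# Approximate reproduction of the taxi example's _fill_in_missing helper used in
# our preprocessing_fn; it expects SparseTensors of shape [batch_size, 1].
def _fill_in_missing(x):
    default_value = '' if x.dtype == tf.string else 0
    return tf.squeeze(
        tf.sparse.to_dense(
            tf.SparseTensor(x.indices, x.values, [x.dense_shape[0], 1]),
            default_value),
        axis=1)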

The Keras model expects 3 inputs: 1 DENSE_FLOAT_FEATURE_KEY and 2 VOCAB_FEATURE_KEYS.

The error I am currently experiencing is 'can only concatenate str (not "SparseTensor") to str', which occurs at the line parsed_features = {'raw_data': raw_data_sp_tensor, }.
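
What I imagine the second signature would need to look like is something along these lines, as another method on MyModule; this is an untested sketch on my end, and the feature names/dtypes are placeholders for our actual raw feature keys.

# Untested sketch: one input tensor per raw feature (names and dtypes are
# placeholders for the actual DENSE_FLOAT_FEATURE_KEY / VOCAB_FEATURE_KEYS).
RAW_SERVING_FEATURES = {
    'dense_feature': tf.float32,
    'vocab_feature_1': tf.string,
    'vocab_feature_2': tf.string,
}

@tf.function(input_signature=[{
    name: tf.TensorSpec(shape=[None, 1], dtype=dtype, name=name)
    for name, dtype in RAW_SERVING_FEATURES.items()
}])
def serve_raw_fn(self, raw_features):
    # The tft_layer was traced on the raw feature spec (VarLenFeatures), so each
    # dense [batch, 1] input is converted to a SparseTensor before transforming.
    # Note: tf.sparse.from_dense treats '' / 0 as missing, which _fill_in_missing
    # would then fill back in with the same defaults.
    sparse_features = {
        name: tf.sparse.from_dense(tensor)
        for name, tensor in raw_features.items()
    }
    transformed_features = self.model.tft_layer(sparse_features)
    return self.model(transformed_features)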

I also attempted to manually create the feature spec (per https://github.com/tensorflow/tfx/pull/1906/files), but there were naming convention issues and I was unable to expose schema2tensorspec to ensure similar expected input names/data types.

Any and all help is welcome. I am posting this as a docs issue because there does not seem to be any consensus or official documentation regarding creating a raw input signature def with multiple inputs.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 17 (7 by maintainers)

Top GitHub Comments

2 reactions
ConverJens commented, Aug 10, 2021

@yrianderreumaux Thanks for the clarification, now I understand the issue!

As far as I can tell, your first serving fn can be used for inputting tf.Examples, hence it should work with the Evaluator.

As for serving JSON strings directly: depending on how you are hosting your model, this might not be an issue. If you are using TF Serving, it supplies an HTTP port for REST calls in JSON format and the parsing to tf.Example is done server side: https://www.tensorflow.org/tfx/serving/api_rest#start_modelserver_with_the_rest_api_endpoint. In this case it should be sufficient to use the first serving fn only.

If you are hosting in some other way (KFServing, a custom service, etc.) you will need to change your second serving fn to handle the mapping from JSON string to tf.Example. Perhaps something similar to this (inside your tf_serving_raw_input_fn(self, raw_data) fn):

# Disclaimer: this code is completely untested and most likely has some issue.
import json

from tfx.components.example_gen import utils as example_gen_utils
...
json_dict = json.loads(json_string)  # JSON string to dict
parsed_features = example_gen_utils.dict_to_example(json_dict)  # TFX contains a utility fn for converting a dict to an (unserialized) tf.Example
transformed_features = self.model.tft_layer(parsed_features)

But apart from that, I would raise some concerns about NOT using tf.Examples for serving:

  1. protobuf (over gRPC) is faster than JSON over HTTP, and it is also smaller in size
  2. when using a model with a TFT layer, the data needs to be in tf.Example format. Serialization needs to happen, either on the client or on the server side. But note that this is generally a cheap operation and not usually something that needs to be avoided.
  3. JSON is also not type safe in any way compared to protobuf. You would rather have serving fail on the client side on single data points than mess up an entire batch server side due to one bad data point.

But note that the same util method I used in the code example above can be used client side to easily build tf.Examples from your JSON.
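
For instance, a rough, untested client-side sketch along those lines; the field names, endpoint URL and model name are placeholders (here assuming TF Serving's REST port):

# Rough, untested sketch: build a serialized tf.Example from a JSON record with
# TFX's utility, then send it to the default serving signature (which expects a
# batch of serialized tf.Examples on its 'examples' input) over REST.
import base64
import json

import requests
from tfx.components.example_gen import utils as example_gen_utils

record = json.loads('{"trip_miles": 1.5, "payment_type": "Cash"}')  # hypothetical fields
example = example_gen_utils.dict_to_example(record)  # dict -> tf.train.Example
serialized = example.SerializeToString()             # proto bytes

payload = {
    "signature_name": "serving_default",
    "instances": [{"examples": {"b64": base64.b64encode(serialized).decode("utf-8")}}],
}
response = requests.post("http://localhost:8501/v1/models/my_model:predict", json=payload)
print(response.json())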

1 reaction
yrianderreumaux commented, Aug 9, 2021

Hi @ConverJens,

I really appreciate you taking the time to respond to my question as well as clarifying the different features of the serving fn. I am clearly new to TFX, and I realize that I may have added confusion (both to myself and to others) when I used the term "raw data", so I will attempt to clarify what our goal is with the two serving functions.

In our pipeline, ExampleGen pulls in data from BigQuery using BigQueryExampleGen. Thus "raw data" refers to JSON data. The ExampleGen component splits this data for training and evaluation, which it emits as tf.Example records. The tf.Example records are then fed into TFT, which preprocesses our data (e.g., computing a vocabulary) before it is fed into the model.

The motivation for having two separate serving functions is that we would like one that is used in the Evaluator component and expects tf.Example records, and a second that is used when the model is served and expects JSON data. More specifically, we would like to define this second signature in a way that allows a TF Transform-coupled model hosted on AI Platform's prediction service to accept JSON data and to serialize and transform this incoming data consistent with the model's expected inputs. That way, we would not have to create a fully serialized tf.Example on the client side prior to submission.

In sum, I am struggling to create a second signature that takes JSON data, with multiple different data types, coming from BigQuery and turns it into a digestible format to be sent to our prediction API (not training).
