Clarity issues when creating Keras raw serving signature with multiple inputs hosted on AI Platform prediction service

Thanks in advance for any guidance on this issue, and I apologize if I am missing something in the docs. Despite attempting related solutions (e.g., #1108, #1885, #1906), I have failed to successfully create two export signature defs: one that would allow raw text predictions via AI Platform's prediction service, and one that would allow the ExampleGen component's example output (example_gen.outputs['examples']) to be used for model evaluation in the Evaluator component.

For reference, I am following the taxi example closely, with the main difference being that I am pulling my own data via BigQuery with BigQueryExampleGen.
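
For context, the BigQuery pull is wired up roughly as in the sketch below; the query is a placeholder, and the import path is the one I believe TFX 0.30 uses for the BigQuery extension.

# Rough sketch (placeholder query; import path assumed from TFX 0.30's extensions module).
from tfx.extensions.google_cloud_big_query.example_gen.component import BigQueryExampleGen

example_gen = BigQueryExampleGen(
    query='SELECT * FROM `my_project.my_dataset.my_table`'  # hypothetical table
)
# Downstream components consume example_gen.outputs['examples'] as tf.Example records.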

dependencies

  1. tfx[kfp]==0.30.0
  2. python 3.7

My latest attempt follows this solution, which integrates a separate export signature for raw data via a MyModule(tf.Module) class. Similar to the author @jason-brian-anderson, we avoided the use of tf.reshape because we use the _fill_in_missing operation in our preprocessing_fn, which expects and parses SparseTensors (the helper itself is reproduced after the code below for reference). Below is the code, embedded within the scope of the run_fn.

class MyModule(tf.Module):
    def __init__(self, model, tf_transform_output):
        self.model = model
        self.tf_transform_output = tf_transform_output
        self.model.tft_layer = self.tf_transform_output.transform_features_layer()

    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')])
    def serve_tf_examples_fn(self, serialized_tf_examples):
        feature_spec = self.tf_transform_output.raw_feature_spec()
        feature_spec.pop(features.LABEL_KEY)
        parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
        transformed_features = self.model.tft_layer(parsed_features)
        return self.model(transformed_features)

    @tf.function(input_signature=[tf.TensorSpec(shape=(None), dtype=tf.string, name='raw_data')])
    def tf_serving_raw_input_fn(self, raw_data):
        raw_data_sp_tensor = tf.sparse.SparseTensor(
            indices=[[0, 0]],
            values=raw_data,
            dense_shape=(1, 1)
        )
        parsed_features = {'raw_data': raw_data_sp_tensor}
        transformed_features = self.model.tft_layer(parsed_features)
        return self.model(transformed_features)


module = MyModule(model, tf_transform_output)

signatures = {
    "serving_default": module.serve_tf_examples_fn,
    "serving_raw_input": module.tf_serving_raw_input_fn,
}
tf.saved_model.save(
    module,
    export_dir=fn_args.serving_model_dir,
    signatures=signatures,
    options=None,
)
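
For reference, the _fill_in_missing helper in our preprocessing_fn is taken from the taxi example and looks roughly like the sketch below (reproduced approximately from the example rather than our exact code), which is why the raw-input signature above builds a SparseTensor by hand.

# Approximate reproduction of the taxi example's _fill_in_missing helper used in
# our preprocessing_fn; it expects SparseTensors of shape [batch_size, 1].
def _fill_in_missing(x):
    default_value = '' if x.dtype == tf.string else 0
    return tf.squeeze(
        tf.sparse.to_dense(
            tf.SparseTensor(x.indices, x.values, [x.dense_shape[0], 1]),
            default_value),
        axis=1)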

The Keras model expects 3 inputs: 1 DENSE_FLOAT_FEATURE_KEY and 2 VOCAB_FEATURE_KEYS.

The error I am currently experiencing is 'can only concatenate str (not "SparseTensor") to str', which occurs at the line parsed_features = {'raw_data': raw_data_sp_tensor, }.
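
What I imagine the second signature would need to look like is something along these lines, as another method on MyModule; this is an untested sketch on my end, and the feature names/dtypes are placeholders for our actual raw feature keys.

# Untested sketch: one input tensor per raw feature (names and dtypes are
# placeholders for the actual DENSE_FLOAT_FEATURE_KEY / VOCAB_FEATURE_KEYS).
RAW_SERVING_FEATURES = {
    'dense_feature': tf.float32,
    'vocab_feature_1': tf.string,
    'vocab_feature_2': tf.string,
}

@tf.function(input_signature=[{
    name: tf.TensorSpec(shape=[None, 1], dtype=dtype, name=name)
    for name, dtype in RAW_SERVING_FEATURES.items()
}])
def serve_raw_fn(self, raw_features):
    # The tft_layer was traced on the raw feature spec (VarLenFeatures), so each
    # dense [batch, 1] input is converted to a SparseTensor before transforming.
    # Note: tf.sparse.from_dense treats '' / 0 as missing, which _fill_in_missing
    # would then fill back in with the same defaults.
    sparse_features = {
        name: tf.sparse.from_dense(tensor)
        for name, tensor in raw_features.items()
    }
    transformed_features = self.model.tft_layer(sparse_features)
    return self.model(transformed_features)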

I also attempted to manually create the feature spec (per https://github.com/tensorflow/tfx/pull/1906/files), but there were naming convention issues and I was unable to expose schema2tensorspec to ensure similar expected input names/data types.

Any and all help is welcome. I am posting this as a docs issue because there does not seem to be any consensus or official documentation regarding creating a raw input signature def with multiple inputs.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 17 (7 by maintainers)

Top GitHub Comments

2 reactions
ConverJens commented, Aug 10, 2021

@yrianderreumaux Thanks for the clarification, now I understand the issue!

As far as I can tell, your first serving fn can be used for inputting tf.Examples, hence it should work with the Evaluator.

As for serving JSON strings directly: depending on how you are hosting your model, this might not be an issue. If you are using TF Serving, it supplies an HTTP port for REST calls in JSON format and the parsing to tf.Example is done server side: https://www.tensorflow.org/tfx/serving/api_rest#start_modelserver_with_the_rest_api_endpoint. In this case it should be sufficient to use the first serving fn only.

If you are hosting in some other way (KFServing, a custom service, etc.) you will need to change your second serving fn to handle the mapping from JSON string to tf.Example. Perhaps something similar to this (inside your tf_serving_raw_input_fn(self, raw_data) fn):

# Disclaimer: this code is completely untested and most likely has some issue.
import json

from tfx.components.example_gen import utils as example_gen_utils
...
json_dict = json.loads(json_string)  # JSON string to dict
parsed_features = example_gen_utils.dict_to_example(json_dict)  # TFX contains a utility fn for converting a dict to an (unserialized) tf.Example
transformed_features = self.model.tft_layer(parsed_features)

But apart from that, I would raise some concerns about NOT using tf.Examples for serving:

  1. protobuf (over gRPC) is faster than JSON over HTTP, and it is also smaller in size
  2. when using a model with a TFT layer, the data needs to be in tf.Example format. Serialization needs to happen, either on the client or on the server side. But note that this is generally a cheap operation and not usually something that needs to be avoided.
  3. JSON is also not type safe in any way compared to protobuf. You would rather have serving fail on the client side on single data points than mess up an entire batch server side due to one bad data point.

But note that the same util method I used in the code example above can be used client side to easily build tf.Examples from your JSON.
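
For instance, a rough, untested client-side sketch along those lines; the field names, endpoint URL and model name are placeholders (here assuming TF Serving's REST port):

# Rough, untested sketch: build a serialized tf.Example from a JSON record with
# TFX's utility, then send it to the default serving signature (which expects a
# batch of serialized tf.Examples on its 'examples' input) over REST.
import base64
import json

import requests
from tfx.components.example_gen import utils as example_gen_utils

record = json.loads('{"trip_miles": 1.5, "payment_type": "Cash"}')  # hypothetical fields
example = example_gen_utils.dict_to_example(record)  # dict -> tf.train.Example
serialized = example.SerializeToString()             # proto bytes

payload = {
    "signature_name": "serving_default",
    "instances": [{"examples": {"b64": base64.b64encode(serialized).decode("utf-8")}}],
}
response = requests.post("http://localhost:8501/v1/models/my_model:predict", json=payload)
print(response.json())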

1 reaction
yrianderreumaux commented, Aug 9, 2021

Hi @ConverJens,

I really appreciate you taking the time to respond to my question as well as clarifying the different features of the serving fn. I am clearly new to TFX, and I realize that I may have added confusion (both to myself and to others) when I used the term "raw data", so I will attempt to clarify what our goal is with the two serving functions.

In our pipeline, ExampleGen pulls in data from BigQuery using BigQueryExampleGen. Thus "raw data" refers to JSON data. The ExampleGen component splits this data for training and evaluation, which it emits as tf.Example records. The tf.Example records are then fed into TFT, which preprocesses our data (e.g., computing a vocabulary) before it is fed into the model.

The motivation for having two separate serving functions is that we would like one that is used in the Evaluator component and expects tf.Example records, and a second that is used when the model is served and expects JSON data. More specifically, we would like to define this second signature in a way that allows a TF Transform-coupled model hosted on AI Platform's prediction service to accept JSON data and to serialize and transform this incoming data consistent with the model's expected inputs. That way, we would not have to create a fully serialized tf.Example on the client side prior to submission.

In sum, I am struggling to create a second signature that takes JSON data, with multiple different data types, coming from BigQuery and turns it into a digestible format to be sent to our prediction API (not training).
