
Exporting finetuned model to SavedModel format for Tensorflow Serving

Hi,

Thanks for the excellent work on this repo. I was able to train and fine-tune a custom model using it, and I can successfully test the model from its checkpoints. However, I need to run the model under Tensorflow Serving, which requires the SavedModel format.

So far, I’ve managed to run inference from the checkpoint using the code below:

import os

import tensorflow as tf
from bilm import Batcher, BidirectionalLanguageModel, weight_layers

datadir = os.path.join('/home/ubuntu/mayub/lm_training/elmo', 'finetuned_model')
vocab_file = os.path.join(datadir, 'vocab-2016-09-10.txt')
options_file = os.path.join(datadir, 'options.json')
weight_file = os.path.join(datadir, 'finetune_model_weights.hdf5')

# Create a Batcher to map text to character ids.
batcher2 = Batcher(vocab_file, 50)
 
# Input placeholders to the biLM.
context_character_ids2 = tf.placeholder('int32', shape=(None, None, 50))
 
# Build the biLM graph.
bilm2 = BidirectionalLanguageModel(options_file, weight_file)
 
# Get ops to compute the LM embeddings.
context_embeddings_op2 = bilm2(context_character_ids2)
 
# Get an op to compute ELMo (weighted average of the internal biLM layers)
elmo_context_input2 = weight_layers('input', context_embeddings_op2, l2_coef=0.0)

## Run the Inference with TF Session
with tf.Session() as sess:
    # It is necessary to initialize variables once before running inference.
    sess.run(tf.global_variables_initializer())
 
    # Create batches of data. `tokenized_context` is a list of lists of text tokens (one inner list per sentence).
    context_ids = batcher2.batch_sentences(tokenized_context)
    print("Shape of context ids = ", context_ids.shape)
 
    # Compute ELMo representations (here for the input only, for simplicity).
    elmo_context_input_ = sess.run(
        elmo_context_input2['weighted_op'],
        feed_dict={context_character_ids2: context_ids}
    )
    
print("Shape of generated embeddings = ",elmo_context_input_.shape)
# Output:
# Shape of context ids =  (3, 14, 50)
# Shape of generated embeddings =  (3, 12, 1024)

Here’s a basic outline of how I restore the checkpoint and save it with a SavedModelBuilder:

tf.reset_default_graph()
saver = tf.train.import_meta_graph('/path/to/checkpoint/meta_file.meta')
builder = tf.saved_model.builder.SavedModelBuilder('/path/to/output/dir/')
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess:
    # Restore variables from disk.
    saver.restore(sess, '/path/to/finetuned_model/dir/')
    print("Model restored.")
    builder.add_meta_graph_and_variables(sess, ['custom_tag'], strip_default_attrs=False)
builder.save()

I’m slightly confused about what my SignatureDef should look like, and about how to account for any other pre-processing operations, layers, etc.
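
For concreteness, the sketch below is what I imagine the export looking like, rebuilding the graph from the first snippet rather than importing the meta graph. This is just a sketch, not something I've verified end to end; the input/output key names are placeholders I made up:

import tensorflow as tf
from bilm import BidirectionalLanguageModel, weight_layers

# options_file and weight_file as defined in the first snippet.
context_character_ids = tf.placeholder('int32', shape=(None, None, 50))
bilm = BidirectionalLanguageModel(options_file, weight_file)
context_embeddings_op = bilm(context_character_ids)
elmo_context_input = weight_layers('input', context_embeddings_op, l2_coef=0.0)

builder = tf.saved_model.builder.SavedModelBuilder('/path/to/output/dir/')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Map the character-id placeholder to the weighted ELMo output.
    # 'character_ids' and 'elmo_embeddings' are made-up key names.
    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={'character_ids': context_character_ids},
        outputs={'elmo_embeddings': elmo_context_input['weighted_op']},
    )
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature,
        },
    )
builder.save()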

I want it to be something similar to what the TF Hub ELMo 3 module exposes (sketched after the list below), or at least support the following:

  • Inputs: either text or tokens
  • Outputs: character-based word representations and a weighted sum of the 3 layers
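
For reference, this is roughly how I understand the TF Hub module is called (a sketch of the ELMo 3 interface as I read its docs; the "tokens" signature takes a pre-tokenized, padded batch plus the true sequence lengths):

import tensorflow as tf
import tensorflow_hub as hub

elmo = hub.Module("https://tfhub.dev/google/elmo/3", trainable=False)
embeddings = elmo(
    inputs={"tokens": [["the", "quick", "brown", "fox"]], "sequence_len": [4]},
    signature="tokens",
    as_dict=True,
)["elmo"]  # weighted sum of the 3 layers, shape [batch, max_len, 1024]

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    print(sess.run(embeddings).shape)  # (1, 4, 1024)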

Any help appreciated. Thanks in advance!

@matt-peters

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
carolmanderson commented on Jul 26, 2020

@mohammedayub44 ah, ok. In that case, you can export the model as described in #107 and reload it in Tensorflow 2 within your Streamlit app. Here’s sample code (caveat: I haven’t run this in a Streamlit app, but I have confirmed it works in Tensorflow 2):

import tensorflow as tf

from bilm import Batcher

# reload the model
loaded = tf.saved_model.load("/path/to/saved/model") # this is a directory. Don't include the file itself in the path. 
infer = loaded.signatures["serving_default"]

# get the char ids for your documents
vocab_file = '/path/to/my_vocab.txt'
batcher = Batcher(vocab_file, 50)
char_ids = batcher.batch_sentences([["Hello", "world"]])
char_ids = char_ids.astype('int32')    # must be cast to int32 before feeding to model

# get embeddings
embs = infer(tf.constant(char_ids))['import/concat_3:0']

Don’t be alarmed if you see this message: “INFO:tensorflow:Saver not created because there are no variables in the graph to restore.” This is expected.

Regarding the output size, you’ll get a 3 x 1024 tensor for every token in your input. So long documents or large batches can both cause large outputs.
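
As a back-of-the-envelope illustration (my own example numbers, assuming float32 outputs and the 3 x 1024 per-token shape above):

# Rough output size for one batch: 3 layers x 1024 dims per token, float32.
batch_size, tokens_per_doc = 32, 500
n_floats = batch_size * tokens_per_doc * 3 * 1024
print(f"~{n_floats * 4 / 1e6:.0f} MB per batch")  # ~197 MB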

0 reactions
mohammedayub44 commented on Jul 27, 2020

No problem. It works smoothly in Tensorflow 2. I’ll skip the serving part for now, since loading the model natively works better for me with Streamlit.
