TF: No way to create a string tensor with arbitrary binary data?

Description

I’m trying to feed a Google AutoML object detection saved_model.pb, which accepts a JPEG/PNG-encoded file as a DT_STRING tensor of shape (-1) on the input. It worked well in Java TensorFlow, but I’m struggling to make it work in DJL. I can’t find a way to create a binary tensor of type String. After a few hours of reading the DJL code, I tried the following, without success:

    @Override
    public NDList processInput(TranslatorContext ctx, MatOfByte input) {
        NDArray imageBytes = ctx.getNDManager().create(ByteBuffer.wrap(input.toArray()), new Shape(), DataType.STRING);
        imageBytes.setName("image_bytes");

        NDArray key = ctx.getNDManager().create("test");
        key.setName("key");

        return new NDList(imageBytes, key);
    }

The same saved_model.pb works fine with the Java TensorFlow API, but with DJL, TensorFlow complains, and the tensor handle in imageBytes seems to hold a tensor with address = 0, which does not seem right to me (see screenshot).

[Screenshot: idea64_a63JYaAaR7]
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at strided_slice_op.cc:108 : Invalid argument: slice index 0 of dimension 0 out of bounds.
org.tensorflow.exceptions.TFInvalidArgumentException: slice index 0 of dimension 0 out of bounds.

The shape of the AutoML model:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['image_bytes'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: encoded_image_string_tensor:0
    inputs['key'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: key:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['detection_boxes'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 40, 4)
        name: detection_boxes:0
    outputs['detection_classes'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 40)
        name: detection_classes:0
    outputs['detection_classes_as_text'] tensor_info:
        dtype: DT_STRING
        shape: (-1, -1)
        name: detection_classes_as_text:0
    outputs['detection_multiclass_scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 40, 3)
        name: detection_multiclass_scores:0
    outputs['detection_scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 40)
        name: detection_scores:0
    outputs['image_info'] tensor_info:
        dtype: DT_INT32
        shape: (-1, 6)
        name: Tile_1:0
    outputs['key'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: Identity:0
    outputs['num_detections'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1)
        name: num_detections:0
  Method name is: tensorflow/serving/predict

Working code from Java TensorFlow API to create the input tensors:

TString inputTensor = TString.tensorOfBytes(NdArrays.vectorOfObjects(buffer.toArray()));
TString keyTensor = TString.scalarOf("test");

Expected Behavior

There should be a way to create a binary string tensor, as is possible with the TensorFlow Java, Python, and other APIs.

Error Message

 org.tensorflow.exceptions.TFInvalidArgumentException: slice index 0 of dimension 0 out of bounds.
	 [[{{node map/TensorArrayUnstack/strided_slice}}]]
	at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:87)
	at ai.djl.tensorflow.engine.javacpp.JavacppUtils.runSession(JavacppUtils.java:192)
	at ai.djl.tensorflow.engine.TfSymbolBlock.forwardInternal(TfSymbolBlock.java:131)
	at ai.djl.nn.AbstractBlock.forward(AbstractBlock.java:121)
	at ai.djl.nn.Block.forward(Block.java:122)
	at ai.djl.inference.Predictor.predict(Predictor.java:123)
	at ai.djl.inference.Predictor.batchPredict(Predictor.java:150)
	at ai.djl.inference.Predictor.predict(Predictor.java:118)

What have you tried to solve it?

I spent hours reading the code of TfNDManager, JavacppUtils, etc. to try to understand how to do it.

Environment Info

DJL version: 0.12.0

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

frankfliu commented, Sep 26, 2021 (1 reaction)

@madprogrammer Thanks for your input.

You can now create a String tensor using either a charset or a ByteBuffer:

        TfNDManager manager = ((TfNDManager)ctx.getNDManager());
        NDArray imageBytes = manager.createStringTensor(new Shape(1), ByteBuffer.wrap(input));
        imageBytes.setName("image_bytes");

The ByteBuffer variant is only available in TfNDManager.
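
To illustrate why a byte-oriented path matters for encoded images, here is a minimal sketch that uses only the Java standard library. It does not call DJL or TensorFlow; the class and method names (BinaryStringDemo, roundTripThroughString, copyViaByteBuffer) are hypothetical and exist only for this example. It compares passing the first bytes of a JPEG file through a String with a charset against passing them through a ByteBuffer:

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: these names are hypothetical and not part of the DJL API.
public class BinaryStringDemo {

    // Decodes the bytes into a String with the given charset, then encodes them back.
    static byte[] roundTripThroughString(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // Reads the bytes back out of a ByteBuffer without any character decoding.
    static byte[] copyViaByteBuffer(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return out;
    }

    public static void main(String[] args) {
        // First four bytes of a typical JPEG file (SOI marker followed by an APP0 marker).
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        byte[] viaUtf8 = roundTripThroughString(jpegHeader, StandardCharsets.UTF_8);
        byte[] viaLatin1 = roundTripThroughString(jpegHeader, StandardCharsets.ISO_8859_1);
        byte[] viaBuffer = copyViaByteBuffer(jpegHeader);

        System.out.println("UTF-8 round trip preserved bytes: " + Arrays.equals(jpegHeader, viaUtf8));
        System.out.println("ISO-8859-1 round trip preserved bytes: " + Arrays.equals(jpegHeader, viaLatin1));
        System.out.println("ByteBuffer copy preserved bytes: " + Arrays.equals(jpegHeader, viaBuffer));
    }
}
```

The JPEG header bytes are not valid UTF-8, so decoding them as UTF-8 substitutes replacement characters and the original bytes cannot be recovered. A single-byte charset such as ISO-8859-1 happens to round-trip every byte value, but handing the raw bytes over in a ByteBuffer avoids the decoding step altogether.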

frankfliu commented, Sep 28, 2021 (0 reactions)

This is fixed by https://github.com/deepjavalibrary/djl/pull/1251. You can try 0.13.0-SNAPSHOT version now.
