
Shape of example features doesn't match expectations when using example Keras code

See original GitHub issue

Hello,

I'm trying to run the example code using the Keras model and the tf_record data. The problem is that the example feature has shape (None, None, None), but it is expected to have only 2 dimensions.

It is probably something simple; I hope somebody can point me to the error.

Here is my example code to reproduce the problem; it is mostly copy-pasted from the examples (the error message is below).

from google.protobuf import text_format
from tensorflow_serving.apis import input_pb2
import os
import tensorflow as tf
import tensorflow_ranking as tfr


_FILE_NAME = "/tmp/ranking_example.tf_record"
_LABEL_FEATURE = "relevance"
_PADDING_LABEL = -1
_SIZE = "example_list_size"


def create_feature_columns():
  sparse_column = tf.feature_column.categorical_column_with_hash_bucket(
      key="user_id", hash_bucket_size=100, dtype=tf.int64)
  query_embedding = tf.feature_column.embedding_column(
      categorical_column=sparse_column, dimension=20)
  context_feature_columns = {"user_id": query_embedding}

  sparse_column = tf.feature_column.categorical_column_with_hash_bucket(
      key="document_id", hash_bucket_size=100, dtype=tf.int64)
  document_embedding = tf.feature_column.embedding_column(
      categorical_column=sparse_column, dimension=20)
  example_feature_columns = {"document_id": document_embedding}

  return context_feature_columns, example_feature_columns


def make_dataset(file_pattern,
                 batch_size,
                 randomize_input=False,
                 num_epochs=None):
  context_feature_columns, example_feature_columns = create_feature_columns()
  context_feature_spec = tf.feature_column.make_parse_example_spec(
      context_feature_columns.values())
  label_column = tf.feature_column.numeric_column(
      _LABEL_FEATURE, dtype=tf.int64, default_value=_PADDING_LABEL)
  example_feature_spec = tf.feature_column.make_parse_example_spec(
      list(example_feature_columns.values()) + [label_column])
  dataset = tfr.data.build_ranking_dataset(
      file_pattern=file_pattern,
      data_format=tfr.data.ELWC,
      batch_size=batch_size,
      context_feature_spec=context_feature_spec,
      example_feature_spec=example_feature_spec,
      reader=tf.data.TFRecordDataset,
      shuffle=randomize_input,
      num_epochs=num_epochs,
      size_feature_name=_SIZE)

  def _separate_features_and_label(features):
    label = tf.squeeze(features.pop(_LABEL_FEATURE), axis=2)
    label = tf.cast(label, tf.float32)
    return features, label

  dataset = dataset.map(_separate_features_and_label)
  return dataset


def test_ranking_example():
    samples = text_format.Parse(
        """
        context {
          features {
            feature {
              key: "user_id"
              value { int64_list { value: 1 } }
            }
          }
        }
        examples {
          features {
            feature {
              key: "document_id"
              value { int64_list { value: 1 } }
            }
            feature {
              key: "relevance"
              value { int64_list { value: 1 } }
            }
          }
        }
        examples {
          features {
            feature {
              key: "document_id"
              value { int64_list { value: 2 } } 
            }
            feature {
              key: "relevance"
              value { int64_list { value: 0 } }
            }
          }
        }""", input_pb2.ExampleListWithContext())

    try:
        os.remove(_FILE_NAME)
    except FileNotFoundError:
        pass

    with tf.io.TFRecordWriter(_FILE_NAME) as writer:
        for sample in [samples]*6:
            writer.write(sample.SerializeToString())

    batch_size = 2
    dataset = make_dataset(_FILE_NAME, batch_size)
    context_feature_columns, example_feature_columns = create_feature_columns()
    # Use a Premade Network, or subclass and build your own!
    network = tfr.keras.canned.DNNRankingNetwork(
        context_feature_columns=context_feature_columns,
        example_feature_columns=example_feature_columns,
        hidden_layer_dims=[1024, 512, 256],
        activation=tf.nn.relu,
        dropout=0.5)

    softmax_loss_obj = tfr.keras.losses.get(tfr.losses.RankingLossKey.SOFTMAX_LOSS)

    # Contains all ranking metrics, including NDCG @ {1, 3, 5, 10}.
    default_metrics = tfr.keras.metrics.default_keras_metrics()

    ranker = tfr.keras.model.create_keras_model(
        network=network,
        loss=softmax_loss_obj,
        metrics=default_metrics,
        optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.05),
        size_feature_name=_SIZE)

    r = ranker.fit(
        dataset,
        steps_per_epoch=4,
        epochs=10
    )

    assert r

error:

test_ranking_all.py:132: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py:819: in fit
    use_multiprocessing=use_multiprocessing)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py:235: in fit
    use_multiprocessing=use_multiprocessing)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py:593: in _process_training_inputs
    use_multiprocessing=use_multiprocessing)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py:706: in _process_inputs
    use_multiprocessing=use_multiprocessing)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/data_adapter.py:702: in __init__
    x = standardize_function(x)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py:684: in standardize_function
    return dataset.map(map_fn, num_parallel_calls=dataset_ops.AUTOTUNE)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/data/ops/dataset_ops.py:1591: in map
    self, map_func, num_parallel_calls, preserve_cardinality=True)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/data/ops/dataset_ops.py:3926: in __init__
    use_legacy_function=use_legacy_function)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/data/ops/dataset_ops.py:3147: in __init__
    self._function = wrapper_fn._get_concrete_function_internal()
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py:2395: in _get_concrete_function_internal
    *args, **kwargs)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py:2389: in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py:2703: in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py:2593: in _create_graph_function
    capture_by_value=self._capture_by_value),
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/framework/func_graph.py:978: in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/data/ops/dataset_ops.py:3140: in wrapper_fn
    ret = _wrapper_helper(*args)
../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/data/ops/dataset_ops.py:3082: in _wrapper_helper
    ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = ({'document_id': <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7f47446fded0>, 'example_list_size....framework.sparse_tensor.SparseTensor object at 0x7f4744716590>}, <tf.Tensor 'args_3:0' shape=(2, None) dtype=float32>)
kwargs = {}
options = <tensorflow.python.autograph.core.converter.ConversionOptions object at 0x7f4744716950>

    def wrapper(*args, **kwargs):
      """Wrapper that calls the converted version of f."""
      options = converter.ConversionOptions(
          recursive=recursive,
          user_requested=user_requested,
          optional_features=optional_features)
      try:
        return converted_call(f, args, kwargs, options=options)
      except Exception as e:  # pylint:disable=broad-except
        if hasattr(e, 'ag_error_metadata'):
>         raise e.ag_error_metadata.to_exception(e)
E         ValueError: in converted code:
E         
E             /content-recommendations/.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py:677 map_fn
E                 batch_size=None)
E             /content-recommendations/.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py:2410 _standardize_tensors
E                 exception_prefix='input')
E             content-recommendations/.venv/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_utils.py:573 standardize_input_data
E                 'with shape ' + str(data_shape))
E         
E             ValueError: Error when checking input: expected document_id to have 2 dimensions, but got array with shape (None, None, None)

../../../.venv/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py:237: ValueError
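
A minimal inspection sketch (assuming the make_dataset and constants defined above; the printed values are illustrative, inferred from the traceback) shows where the unexpected third dimension comes from: the parsed sparse example features arrive as 3-D SparseTensors of shape (batch_size, list_size, values), while the Keras input layer built by tensorflow-ranking 0.3.0 declares them with only 2 dimensions.

ds = make_dataset(_FILE_NAME, batch_size=2)
features, label = next(iter(ds))

# "document_id" is parsed as a SparseTensor whose dense_shape has three
# entries: (batch_size, list_size, values_per_example). The third entry
# is the dimension the Keras input layer does not expect.
print(type(features["document_id"]))        # SparseTensor
print(features["document_id"].dense_shape)  # e.g. [2 2 1]
print(features[_SIZE])                      # list size per batch entry
print(label.shape)                          # e.g. (2, 2)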

pip list:

Package                Version    
---------------------- -----------
absl-py                0.9.0      
alembic                1.4.2      
aniso8601              7.0.0      
appdirs                1.4.3      
astor                  0.8.1      
attrs                  19.3.0     
black                  19.10b0    
bleach                 3.1.4      
boto3                  1.9.253    
botocore               1.12.253   
cached-property        1.5.1      
cachetools             4.1.0      
certifi                2020.4.5.1 
cfgv                   3.1.0      
chardet                3.0.4      
click                  7.1.1      
coloredlogs            14.0       
contextlib2            0.6.0.post1
cycler                 0.10.0     
dagit                  0.7.7      
dagster                0.7.7      
dagster-aws            0.7.7      
dagster-cron           0.7.7      
dagster-graphql        0.7.7      
dagster-pandas         0.7.7      
decorator              4.4.2      
defusedxml             0.6.0      
distlib                0.3.0      
docutils               0.15.2     
entrypoints            0.3        
filelock               3.0.12     
Flask                  1.1.2      
Flask-Cors             3.0.8      
Flask-GraphQL          2.0.1      
Flask-Sockets          0.2.1      
funcsigs               1.0.2      
future                 0.18.2     
gast                   0.2.2      
gevent                 20.4.0     
gevent-websocket       0.10.1     
google-auth            1.14.0     
google-auth-oauthlib   0.4.1      
google-pasta           0.2.0      
graphene               2.1.8      
graphql-core           2.3.1      
graphql-relay          2.0.1      
graphql-server-core    1.2.0      
graphql-ws             0.3.0      
graphviz               0.14       
greenlet               0.4.15     
grpcio                 1.28.1     
h5py                   2.10.0     
humanfriendly          8.2        
identify               1.4.15     
idna                   2.9        
importlib-metadata     1.6.0      
ipython-genutils       0.2.0      
itsdangerous           1.1.0      
Jinja2                 2.11.2     
jmespath               0.9.5      
jsonschema             3.2.0      
jupyter-core           4.6.3      
Keras-Applications     1.0.8      
Keras-Preprocessing    1.1.0      
kiwisolver             1.2.0      
Mako                   1.1.2      
Markdown               3.2.1      
MarkupSafe             1.1.1      
matplotlib             3.2.1      
mistune                0.8.4      
more-itertools         8.2.0      
nbconvert              5.6.1      
nbformat               5.0.6      
nodeenv                1.3.5      
numpy                  1.18.3     
oauthlib               3.1.0      
opt-einsum             3.2.1      
packaging              20.3       
pandas                 1.0.3      
pandocfilters          1.4.2      
pathspec               0.8.0      
pathtools              0.1.2      
pip                    20.0.2     
pip-tools              5.0.0      
pkg-resources          0.0.0      
pluggy                 0.13.1     
pre-commit             2.3.0      
promise                2.3        
protobuf               3.11.3     
psycopg2-binary        2.8.5      
py                     1.8.1      
pyarrow                0.17.0     
pyasn1                 0.4.8      
pyasn1-modules         0.2.8      
Pygments               2.6.1      
PyMySQL                0.9.3      
pyparsing              2.4.7      
pyrsistent             0.16.0     
pytest                 5.4.1      
python-crontab         2.4.1      
python-dateutil        2.8.1      
python-editor          1.0.4      
pytz                   2019.3     
PyYAML                 5.3.1      
regex                  2020.4.4   
requests               2.23.0     
requests-oauthlib      1.3.0      
rsa                    4.0        
Rx                     1.6.1      
s3transfer             0.2.1      
scipy                  1.4.1      
setuptools             46.1.3     
six                    1.14.0     
SQLAlchemy             1.3.16     
tensorboard            2.1.1      
tensorflow             2.1.0      
tensorflow-estimator   2.1.0      
tensorflow-ranking     0.3.0      
tensorflow-serving-api 2.1.0      
termcolor              1.1.0      
terminaltables         3.1.0      
testpath               0.4.4      
toml                   0.10.0     
toposort               1.5        
tqdm                   4.45.0     
traitlets              4.3.3      
typed-ast              1.4.1      
urllib3                1.25.9     
virtualenv             20.0.18    
watchdog               0.10.2     
wcwidth                0.1.9      
webencodings           0.5.1      
Werkzeug               1.0.1      
wheel                  0.34.2     
wrapt                  1.12.1     
zipp                   3.1.0  

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
filthysocks commented, May 27, 2020

@filthysocks Curious: why did you choose to make your document_id a categorical_column_with_hash_bucket column vs. a numeric column?

Well, it's categorical data, and TensorFlow only offers three different solutions here (see https://www.tensorflow.org/tutorials/structured_data/feature_columns#categorical_columns). A numeric column wouldn't make much sense; it would imply that there is somehow a meaning in the ID value itself (e.g. 3x the ID = 3x better).
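
A minimal sketch of the two options being contrasted (illustrative only, mirroring the keys used in the reproducer above): a numeric column feeds the raw ID into the network as a magnitude, while the hash-bucket-plus-embedding route learns a dense vector per bucket.

# Option A (rejected): treats the ID as a magnitude, so id=30 would
# contribute 30x the signal of id=1.
numeric_id = tf.feature_column.numeric_column("document_id", dtype=tf.int64)

# Option B (used above): hash the ID into buckets and learn an embedding
# per bucket, so the numeric value of the ID carries no ordinal meaning.
hashed_id = tf.feature_column.categorical_column_with_hash_bucket(
    key="document_id", hash_bucket_size=100, dtype=tf.int64)
embedded_id = tf.feature_column.embedding_column(
    categorical_column=hashed_id, dimension=20)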

I came across the same error trying to fit the Keras model on my own data. I dug around and traced the bug to this line of code: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/python/keras/feature.py#L61, shape = [list_size] + list(spec.shape) if not is_sparse else (1,). I changed my local source to shape = [list_size] + list(spec.shape) if not is_sparse else (list_size, None) and it ran. @ramakumar1729 Looking forward to your validation and possibly a quick bug-fix release. Thx!

Cool, hope it'll work.

1 reaction
ramakumar1729 commented, May 27, 2020

I came across the same error trying to fit the Keras model on my own data. I dug around and traced the bug to this line of code: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/python/keras/feature.py#L61, shape = [list_size] + list(spec.shape) if not is_sparse else (1,). I changed my local source to shape = [list_size] + list(spec.shape) if not is_sparse else (list_size, None) and it ran. @ramakumar1729 Looking forward to your validation and possibly a quick bug-fix release. Thx!

@yzhangswingman: Thanks for finding the root cause of the issue! This needs fixing, and I'll update it in the next release.
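
For reference, here is the one-line local workaround from the thread as a before/after sketch against tensorflow_ranking/python/keras/feature.py (line 61 in tensorflow-ranking 0.3.0; variable names come from the linked source):

# Before (tf-ranking 0.3.0): sparse example features get a fixed (1,)
# shape, which clashes with the 3-D (batch, list_size, values) tensors
# produced by build_ranking_dataset.
shape = [list_size] + list(spec.shape) if not is_sparse else (1,)

# After (local workaround from the thread): declare one slot per list
# entry with an unbounded number of values.
shape = [list_size] + list(spec.shape) if not is_sparse else (list_size, None)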
