Not able to train DCN model with float values
I am trying to train a DCN model for an LTR (learning-to-rank) task with 2 categorical features and 3 numerical features of float type. Is there a way to embed float features, similar to how IntegerLookup and StringLookup handle int and str features respectively? I tried passing the feature through IntegerLookup itself, but I get the error below.
Code:
tf.keras.Sequential([
    tf.keras.layers.IntegerLookup(vocabulary=<float feature>, mask_token=None),
    tf.keras.layers.Embedding(len(<float feature>) + 1, 64),
])
Error:
FailedPreconditionError Traceback (most recent call last)
<ipython-input-48-4c713ba09dd5> in <module>()
1 tf.keras.Sequential([
----> 2 tf.keras.layers.IntegerLookup(vocabulary=<feature>, mask_token=None),
3 tf.keras.layers.Embedding(len(vocabularies[feat]) + 1, 64)])
12 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
7184 def raise_from_not_ok_status(e, name):
7185 e.message += (" name: " + name if name is not None else "")
-> 7186 raise core._status_to_exception(e) from None # pylint: disable=protected-access
7187
7188
FailedPreconditionError: HashTable has different value for same key. Key 0 has 1 and trying to add value 2 [Op:LookupTableImportV2]
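The error most likely comes from handing continuous float values to an integer lookup table: distinct floats that truncate to the same integer collide on one key (here key 0). A tiny illustration of that collapse, with made-up values:

import tensorflow as tf

# Made-up continuous feature values; truncating them to integers collapses
# distinct floats onto the same key, which is what the HashTable complains about.
float_values = tf.constant([0.2, 0.7, 1.3, 1.9])
print(tf.cast(float_values, tf.int64).numpy())   # [0 0 1 1]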

Implemented bucketize. Totally working. Thank you so much. I will get back if I get stuck again.
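For reference, a minimal sketch of that approach, assuming "bucketize" means discretizing the continuous values into bins before embedding (the feature values and bin count here are made up):

import numpy as np
import tensorflow as tf

# Hypothetical continuous feature values (e.g. one of the three float LTR features).
float_feature = np.array([0.12, 3.7, 15.0, 42.5, 7.3], dtype=np.float32)

# Bucketize the continuous values into a fixed number of bins, then embed the bin ids.
discretize = tf.keras.layers.Discretization(num_bins=8)
discretize.adapt(float_feature)               # learns bin boundaries from the data

embed_float = tf.keras.Sequential([
    discretize,                               # float value -> integer bucket index
    tf.keras.layers.Embedding(8, 64),         # bucket index -> 64-d embedding
])

print(embed_float(float_feature).shape)       # (5, 64)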
Correct. An embedding layer is just a matrix of size (n_categories, embed_size). It expects its input to be the row indices of your categories. This is why you need a layer before your embedding layer that transforms your input into a consecutive set of indices.
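A minimal sketch of that point (the sizes are made up): the embedding weights really are an (n_categories, embed_size) matrix, and calling the layer just gathers rows of it by index.

import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=6, output_dim=4)  # 6 categories, 4-d embeddings
out = embedding(tf.constant([0, 5]))      # gathers rows 0 and 5 of the weight matrix

print(embedding.embeddings.shape)         # (6, 4)
print(out.shape)                          # (2, 4)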
If your feature vocab is [4, 3, 78, 20], the integer lookup layer will convert these to [2, 3, 4, 5]. This assumes you have 1 unknown (OOV) token and 1 mask token, which are mapped to indices [0, 1] respectively. The most important thing to understand is that if your float features are actually continuous, you must use either of the methods I described above. It won't make sense to treat them as categorical.
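For concreteness, a small sketch of that mapping; the mask token value 0 is an assumption here, since IntegerLookup needs a concrete integer to use as the mask:

import tensorflow as tf

# Mask token at index 0, one OOV bucket at index 1, then the vocabulary terms in order.
lookup = tf.keras.layers.IntegerLookup(vocabulary=[4, 3, 78, 20], mask_token=0)

print(lookup(tf.constant([4, 3, 78, 20])).numpy())   # [2 3 4 5]
print(lookup(tf.constant([0, 999])).numpy())         # [0 1] -> mask index, OOV index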