`tf.keras.layers.experimental.EinsumDense` gives different results depending on batch size
Please go to TF Forum for help and support:
https://discuss.tensorflow.org/tag/keras
If you open a GitHub issue, here is our policy:
It must be a bug, a feature request, or a significant problem with the documentation (for small docs fixes please send a PR instead). The form below must be filled out.
Here's why we have that policy:
Keras developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.
System information.
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- TensorFlow installed from (source or binary):
- TensorFlow version (use command below): 2.6.0 and 2.7.0
- Python version: 3.7
- Bazel version (if compiling from source):
- GPU model and memory:
- Exact command to reproduce:
You can collect some of this information using our environment capture script:
https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh
You can obtain the TensorFlow version with: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
Describe the problem.
The output of the EinsumDense layer depends on the number of examples in a batch.
The issue is rather important because the attention implementation uses EinsumDense layers.
Describe the current behavior
The tf.keras.layers.experimental.EinsumDense layer produces different results depending on the batch size.
Describe the expected behavior
The output of the layer should be independent of the batch dimension.
- Do you want to contribute a PR? (yes/no): no
- If yes, please read this page for instructions
- Briefly describe your candidate solution(if contributing):
Standalone code to reproduce the issue.
import tensorflow as tf

dense = tf.keras.layers.experimental.EinsumDense(
    "bac,acd->bda",
    output_shape=[64, 4],
    bias_axes="da",
)
x = tf.random.uniform((80, 4, 32))
print(dense(x)[0, :4].numpy())
print(dense(x[:1])[0, :4].numpy())
I get results that differ in the 7th-8th significant digit:
[[ 0.12284548 0.2814498 -0.34291047 0.00810905]
[ 0.23635934 -0.08506497 0.12073331 0.33535597]
[-0.30136532 0.34854767 0.41540402 0.1328382 ]
[-0.04340287 -0.07197566 0.17427945 -0.29642397]]
[[ 0.12284552 0.2814498 -0.34291047 0.00810904]
[ 0.23635934 -0.08506497 0.12073331 0.33535597]
[-0.3013654 0.3485477 0.415404 0.1328382 ]
[-0.04340289 -0.07197568 0.17427945 -0.29642397]]
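The size of the discrepancy is consistent with float32 rounding: floating-point addition is not associative, so kernels that reduce the contraction axis in a different order (as TF may choose for different batch sizes) can disagree in the last one or two significant digits. A minimal NumPy sketch of the underlying effect (the arrays and reduction orders below are illustrative, not TF's actual kernels):

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.random(32, dtype=np.float32)
w = rng.random(32, dtype=np.float32)

# The same mathematical dot product, reduced in two different orders.
forward = np.float32(0.0)
for a, b in zip(v, w):
    forward += a * b
backward = np.float32(0.0)
for a, b in zip(v[::-1], w[::-1]):
    backward += a * b

# The two orders agree only up to float32 rounding (~1e-7 relative),
# the same magnitude as the discrepancy shown above.
print(forward, backward, forward - backward)
```

Any two reduction orders are equally "correct" here; neither matches the exact real-number result.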
Stacking several EinsumDense layers leads to accumulated error and wrong predictions.
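The accumulation across a stack of layers can be illustrated outside TF by running the same float32 computation against a float64 reference through a chain of random linear maps; the per-layer rounding error of order 1e-7 grows with depth. A hedged sketch (random matrices standing in for the layers, not the model from the report):

```python
import numpy as np

rng = np.random.default_rng(1)
x32 = rng.standard_normal(64).astype(np.float32)
x64 = x32.astype(np.float64)  # float64 stands in for the exact result

for i in range(8):
    # Scale keeps activations O(1) so the error ratio is meaningful.
    w = (rng.standard_normal((64, 64)) / 8.0).astype(np.float32)
    x32 = w @ x32
    x64 = w.astype(np.float64) @ x64
    # Norm-relative drift of the float32 chain from the reference.
    rel = np.linalg.norm(x32 - x64) / np.linalg.norm(x64)
    print(f"layer {i}: relative float32 error {rel:.1e}")
```

Each layer contributes its own rounding, so the drift only grows; whether it matters depends on how sensitive downstream predictions are to differences at this scale.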
Similar behavior can be observed for tf.einsum:
import tensorflow as tf

x = tf.random.uniform((80, 4, 32))
w = tf.random.uniform((4, 32, 64))
print(tf.einsum("bac,acd->bda", x, w)[0, :4].numpy())
print()
print(tf.einsum("bac,acd->bda", x[:1], w)[0, :4].numpy())
I get results that differ in the 7th-8th significant digit:
[[8.515028 6.533536 8.1234455 6.415581 ]
[8.387905 8.020177 8.997409 6.6566358]
[7.485701 7.6112795 8.145813 7.514901 ]
[8.022375 7.008651 7.9132357 5.918109 ]]
[[8.515028 6.5335355 8.123446 6.4155803]
[8.387905 8.020177 8.997409 6.6566353]
[7.485701 7.611279 8.145813 7.5149007]
[8.022375 7.0086513 7.9132347 5.9181094]]
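Because both results sit within float32 rounding distance of the exact value, a tolerance-based comparison (rather than exact bit equality) treats them as equal. A NumPy sketch of such a check, using a float64 computation as the reference (same einsum equation and shapes as above, but NumPy arrays rather than TF tensors):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((80, 4, 32), dtype=np.float32)
w = rng.random((4, 32, 64), dtype=np.float32)

full = np.einsum("bac,acd->bda", x, w)        # whole batch
single = np.einsum("bac,acd->bda", x[:1], w)  # batch of one
ref = np.einsum("bac,acd->bda", x.astype(np.float64), w.astype(np.float64))

# Exact equality may fail in the last bits, but both float32 results
# match the float64 reference to float32 precision.
print(np.max(np.abs(full[:1] - single)))
print(np.allclose(full[:1], single, rtol=1e-5))
```

The same np.allclose-style tolerance is how one would compare TF outputs here instead of expecting bit-identical values.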
Source code / logs.
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. Try to provide a reproducible test case that is the bare minimum necessary to generate the problem.
Issue Analytics
- Created 2 years ago
- Comments: 5 (2 by maintainers)
@Ghostvv,
Sure, I am closing this issue here and we can follow up in the TensorFlow issue. Thanks!
@Ghostvv,
Please take a look at this comment. Thanks!