
Dense layer doesn't flatten higher dimensional tensors

See original GitHub issue

https://github.com/keras-team/keras/blob/aedad3986200b825d94f847d52bd6b81f0419a06/keras/layers/core.py#L776

The documentation of the Dense layer claims to flatten the input if a tensor with rank > 2 is provided. However, what actually happens is that the Dense layer applies its kernel to the last dimension and computes the result element-wise along the remaining axes.

https://github.com/keras-team/keras/blob/aedad3986200b825d94f847d52bd6b81f0419a06/keras/layers/core.py#L858

You can verify this by comparing two models, one with a Flatten() layer and one without: https://gist.github.com/FirefoxMetzger/44e9e056e45c1a3cc8000ab8d6f2cebe

The first model has only 10 + bias = 11 trainable parameters (reusing weights along the 1st input dimension), while the second model has 10*10 + bias = 101 trainable parameters. The output shapes are also completely different. I would have expected the result to be the same with or without the Flatten() layer…
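
For reference, a minimal sketch along the lines of the linked gist (the exact shapes in the gist may differ; a (10, 10) input and Dense(1) are assumed here to match the parameter counts above):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Model 1: Dense applied directly to a rank-3 input (batch, 10, 10).
# The kernel has shape (10, 1), so weights are reused along axis 1.
m1 = models.Sequential([
    layers.Input(shape=(10, 10)),
    layers.Dense(1),
])
m1.summary()  # output shape (None, 10, 1), 10 + 1 = 11 parameters

# Model 2: Flatten first, then Dense. The kernel has shape (100, 1),
# so there is no weight sharing between positions.
m2 = models.Sequential([
    layers.Input(shape=(10, 10)),
    layers.Flatten(),
    layers.Dense(1),
])
m2.summary()  # output shape (None, 1), 100 + 1 = 101 parameters
```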

It might very well be that I am misunderstanding something. If so, kindly point out my mistake =)

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 9 (7 by maintainers)

Top GitHub Comments

4 reactions
bethard commented, May 2, 2018

I just want to chime in that I was also confused by this documentation, as was this StackOverflow user: https://stackoverflow.com/questions/44611006/timedistributeddense-vs-dense-in-keras-same-number-of-parameters

If nothing else, I would really appreciate it if the note explicitly stated that “flatten” here means something different from the Flatten layer. Ideally, the documentation would give an example of an input of some shape and how that is “flattened” to produce the output shape.
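
For illustration, a minimal sketch of the actual behavior (the shapes here are assumed, not taken from the thread): for an input of shape (batch, M, K), Dense(L) contracts only the last axis with its (K, L) kernel and returns (batch, M, L), not a flattened result:

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(2, 3, 4).astype("float32")  # (batch=2, M=3, K=4)
dense = tf.keras.layers.Dense(5)               # L=5, kernel shape (4, 5)
y = dense(x)                                   # output shape (2, 3, 5), not (2, 5)

# Equivalent to contracting only the last axis with the kernel:
manual = tf.tensordot(x, dense.kernel, axes=1) + dense.bias
print(y.shape, np.allclose(y.numpy(), manual.numpy(), atol=1e-6))  # (2, 3, 5) True
```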

0 reactions
rmanak commented, Jul 14, 2018

I am also confused by this. However, applying a dense layer D[k,l] (of shape (K, L)) to each of the temporal components of an input X[?,m,k] (of shape (?, M, K)) is mathematically identical to the matrix multiplication X * D. This is just a happy coincidence. For the TimeDistributed layer to work with an arbitrary layer, however, Keras needs a “for loop” implementation of this multiplication rather than a fully vectorized one.

If the input were flattened to the shape (?, M*K), the layer would need a kernel of shape (M*K, L) and far more parameters. This does not “correspond, conceptually, to a dot product with a flattened version of the input”; rather, it corresponds conceptually to a dot product with the flattened version in which there are in fact M different copies of the dense layer of shape (K, L), so the temporal components do not share weights. Perhaps that is what they meant by conceptual equivalence.
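
To illustrate the weight-sharing point, a quick sketch (M, K, L are arbitrary shapes assumed for this example): TimeDistributed(Dense(L)) and a plain Dense(L) applied to a (?, M, K) input end up with the same K*L + L parameters, since both reuse one (K, L) kernel across the M temporal steps:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

M, K, L = 3, 4, 5  # arbitrary illustrative shapes

plain = models.Sequential([
    layers.Input(shape=(M, K)),
    layers.Dense(L),  # one (K, L) kernel, applied along the last axis
])
wrapped = models.Sequential([
    layers.Input(shape=(M, K)),
    layers.TimeDistributed(layers.Dense(L)),  # same kernel, looped over M
])

# Both models share one set of weights across the M temporal steps:
print(plain.count_params(), wrapped.count_params())  # 25 25 (K*L + L)
```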

Read more comments on GitHub >

Top Results From Across the Web

Understanding output of Dense layer for higher dimension
I don't have a problem understanding the output shape of a Dense layer followed by a...
Read more >
Should there be a flat layer in between the conv layers and ...
Not sure whether this still matters for your project, but it is important: the Dense layer does not flatten the input first!
Read more >
CNN: Flatten and Dense Layers (Shallow Neural Network)
This Deep Learning in TensorFlow Specialization is a foundational program that will help you understand the principles and ...
Read more >
Nengo_dl converter for keras Dense layer - Nengo forum
If I understand it correctly, the converter seems unable to convert the Dense layer with multi-dimensional inputs, however due to the...
Read more >
tf.keras.layers.Dense | TensorFlow v2.11.0
Just your regular densely-connected NN layer.
Read more >
