
KL Loss correction

See original GitHub issue

https://github.com/lucidrains/DALLE-pytorch/blob/995bfe1789243cbc838943cdc748daab406aae3e/dalle_pytorch/dalle_pytorch.py#L195

I am fairly certain that this should instead read

logits = rearrange(logits, 'b n h w -> (b h w) n')

since we are summing over the latent dimension, i.e. the probs/encoder outputs, and averaging over the observations, i.e. over every spatial position separately and for every sample in the batch. The docs are a little messy on this, but from what I understand, batchmean requires a reshaping such that all examples are condensed into the batch dimension.
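
For context, here is a minimal sketch of the proposed reshaping, assuming logits of shape (b, n, h, w) with n the codebook size and a uniform prior over the codes as the KL target; the shapes and variable names are illustrative, not a verbatim copy of the repository's code:

import math
import torch
import torch.nn.functional as F
from einops import rearrange

# Illustrative sizes: batch, codebook size, spatial height/width
b, n, h, w = 2, 512, 8, 8
logits = torch.randn(b, n, h, w)

# Proposed reshape: fold every spatial position of every sample into the batch
# dimension, so each row is one observation over the n codebook entries.
logits = rearrange(logits, 'b n h w -> (b h w) n')

log_qy = F.log_softmax(logits, dim=-1)                # encoder distribution q(y|x), in log space
log_uniform = torch.full_like(log_qy, -math.log(n))   # log of a uniform prior over the n codes

# F.kl_div(input, target) computes KL(target || input) pointwise; 'batchmean'
# sums the terms and divides by the size of the first dimension, i.e. the
# per-observation KL (summed over the n codes) averaged over b*h*w observations.
kl = F.kl_div(log_uniform, log_qy, reduction='batchmean', log_target=True)

With the (b h w) n layout, 'batchmean' divides by b*h*w, the number of observations; with a b (h w) n layout it would divide by b only, which is the discrepancy the issue points out.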

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

2 reactions
CDitzel commented, Mar 20, 2021

I have no idea what's going on with OpenAI, but so far I don't think they are open to transparent research, as their name would suggest…

1 reaction
CDitzel commented, Mar 17, 2021

I have a headache, Phil. The math demands this term to be there, but when it is present the results are actually worse. Hate it…
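
For readers following along, the term under discussion appears to be the KL regularizer in the discrete VAE objective; a sketch of that objective, assuming the standard ELBO with a uniform categorical prior over a codebook of size K (my reading of the thread, not the repository's exact formulation):

\mathcal{L}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \beta \, D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, \mathrm{Uniform}(K)\big)

With a uniform prior, the KL term equals \log K minus the entropy of q_\phi(z \mid x), so it simply pushes the encoder toward spreading probability mass evenly over the K codes.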

Read more comments on GitHub >

Top Results From Across the Web

How to Calculate the KL Divergence for Machine Learning
Nevertheless, we can calculate the KL divergence using the rel_entr() SciPy function and confirm that our manual calculation is correct.

Kullback–Leibler divergence - Wikipedia
A simple interpretation of the KL divergence of P from Q is the expected excess surprise from using Q as a model when...

Kullback-Leibler Divergence Explained - Count Bayesie
With KL divergence we can calculate exactly how much information is lost when we approximate one distribution with another.

Kullback-Leibler Divergence - an overview - ScienceDirect.com
If the loss function yields more value, it means the decoder does not generate ... DKL is a positive quantity and is equal...

What is the impact of scaling the KL divergence and ...
Variational autoencoders have two components in their loss function. The first component is the reconstruction ...
