Problem training the diffusion prior

While attempting to train the diffusion prior (with train_diffusion_prior.py), I run into the following exception:

Traceback (most recent call last):
  File "train_diffusion_prior.py", line 770, in <module>
    main()
  File "/home/ogezi/miniconda3/envs/playground/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ogezi/miniconda3/envs/playground/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ogezi/miniconda3/envs/playground/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ogezi/miniconda3/envs/playground/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "train_diffusion_prior.py", line 766, in main
    initialize_training(config_file, accelerator)
  File "train_diffusion_prior.py", line 668, in initialize_training
    trainer: DiffusionPriorTrainer = make_model(
  File "train_diffusion_prior.py", line 48, in make_model
    diffusion_prior = prior_config.create()
  File "/home/ogezi/miniconda3/envs/playground/lib/python3.8/site-packages/dalle2_pytorch/train_configs.py", line 181, in create
    return DiffusionPrior(net = diffusion_prior_network, clip = clip, **kwargs)
  File "/home/ogezi/miniconda3/envs/playground/lib/python3.8/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 1174, in __init__
    assert not exists(clip) or clip.dim_latent == self.image_embed_dim, f'you passed in a CLIP to the diffusion prior with latent dimensions of {clip.dim_latent}, but your image embedding dimension (keyword image_embed_dim) for the DiffusionPrior was set to {self.image_embed_dim}'
AssertionError: you passed in a CLIP to the diffusion prior with latent dimensions of 512, but your image embedding dimension (keyword image_embed_dim) for the DiffusionPrior was set to 768

I’ve also run the following snippet to check if any CLIP model has 768-dimensional latents:

import clip
from dalle2_pytorch import DiffusionPrior, DiffusionPriorNetwork, OpenAIClipAdapter
[(m, OpenAIClipAdapter(m).dim_latent) for m in clip.available_models()]

The result is:

[('RN50', 512), ('RN101', 512), ('RN50x4', 512), ('RN50x16', 512), ('RN50x64', 512), ('ViT-B/32', 512), ('ViT-B/16', 512), ('ViT-L/14', 512), ('ViT-L/14@336px', 512)]

So it looks like every available model reports 512-dimensional latents. It's important that my prior generates latents based on OpenAI CLIP. How do I get past this?

Versions:
dalle2_pytorch: 1.10.6
clip: git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1

Top GitHub Comments

lucidrains commented, Sep 29, 2022

@mikeogezi also, plugging @rom1504's new open clip model!

from dalle2_pytorch import OpenClipAdapter

clip = OpenClipAdapter('ViT-H/14')

lucidrains commented, Sep 29, 2022

@mikeogezi Hi Michael! Thanks for surfacing this issue.

Should be resolved at https://github.com/lucidrains/DALLE2-pytorch/commit/c18c0801283d30384912df0e35f225f3df1566a3
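
For anyone who wants to catch this kind of mismatch before building the prior, here is a minimal sketch of a pre-flight check that mirrors the assertion in the traceback above. The helper name check_prior_dims and the stub object are hypothetical and not part of dalle2_pytorch; the only thing assumed about the adapter is the dim_latent attribute that the assertion itself reads.

```python
from types import SimpleNamespace


def check_prior_dims(clip, image_embed_dim):
    """Raise early if a CLIP adapter and the prior disagree on embedding size.

    `clip` may be None (no CLIP passed) or any object exposing `dim_latent`.
    This is a hypothetical helper, not a dalle2_pytorch function.
    """
    if clip is None:
        return
    if clip.dim_latent != image_embed_dim:
        raise ValueError(
            f"CLIP latent dimension is {clip.dim_latent}, "
            f"but image_embed_dim is {image_embed_dim}"
        )


# Stand-in object so the sketch runs without downloading any model weights.
# In real use you would pass the adapter instance instead.
stub_clip = SimpleNamespace(dim_latent=512)

check_prior_dims(stub_clip, 512)  # matching dimensions: passes silently

try:
    check_prior_dims(stub_clip, 768)  # the mismatch from the traceback
except ValueError as err:
    print(err)
```

Running a check like this right after constructing the adapter, with the image_embed_dim value from your config, surfaces the mismatch before the rest of the training setup runs.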
