failed: checkpoint has shape (1, 1, 1, 512) which is incompatible with the model shape (512,)

Loading the VQModel with from_pretrained fails with a ValueError. The code and its full output follow.

# make sure we use compatible versions
VQGAN_REPO = 'flax-community/vqgan_f16_16384'
VQGAN_COMMIT_ID = '90cc46addd2dd8f5be21586a9a23e1b95aa506a9'

# set up VQGAN
vqgan = VQModel.from_pretrained(VQGAN_REPO, revision=VQGAN_COMMIT_ID)
DEBUG:filelock:Attempting to acquire lock 139630901715792 on /root/.cache/huggingface/transformers/9d51ab91692e9c42f82e628f71bc27d13685dba2b0b28841dd1fb163e861cb4f.de091ef3cdb74c7d4cc2da0510c99e9ae385befed9ae4473a3191b6d93da9edd.lock
DEBUG:filelock:Lock 139630901715792 acquired on /root/.cache/huggingface/transformers/9d51ab91692e9c42f82e628f71bc27d13685dba2b0b28841dd1fb163e861cb4f.de091ef3cdb74c7d4cc2da0510c99e9ae385befed9ae4473a3191b6d93da9edd.lock
Downloading: 100%
433/433 [00:00<00:00, 16.0kB/s]
DEBUG:filelock:Attempting to release lock 139630901715792 on /root/.cache/huggingface/transformers/9d51ab91692e9c42f82e628f71bc27d13685dba2b0b28841dd1fb163e861cb4f.de091ef3cdb74c7d4cc2da0510c99e9ae385befed9ae4473a3191b6d93da9edd.lock
DEBUG:filelock:Lock 139630901715792 released on /root/.cache/huggingface/transformers/9d51ab91692e9c42f82e628f71bc27d13685dba2b0b28841dd1fb163e861cb4f.de091ef3cdb74c7d4cc2da0510c99e9ae385befed9ae4473a3191b6d93da9edd.lock
DEBUG:filelock:Attempting to acquire lock 139630891372880 on /root/.cache/huggingface/transformers/98190fe7878f67d122ca4539eb6b459bc77dd757fe54e8cd774952c7f32bab79.8efdbd1ba9de17901e4252a40a48003335296da3f7584ca3cac46bff5d9d142b.lock
DEBUG:filelock:Lock 139630891372880 acquired on /root/.cache/huggingface/transformers/98190fe7878f67d122ca4539eb6b459bc77dd757fe54e8cd774952c7f32bab79.8efdbd1ba9de17901e4252a40a48003335296da3f7584ca3cac46bff5d9d142b.lock
Downloading: 100%
290M/290M [00:05<00:00, 58.7MB/s]
DEBUG:filelock:Attempting to release lock 139630891372880 on /root/.cache/huggingface/transformers/98190fe7878f67d122ca4539eb6b459bc77dd757fe54e8cd774952c7f32bab79.8efdbd1ba9de17901e4252a40a48003335296da3f7584ca3cac46bff5d9d142b.lock
DEBUG:filelock:Lock 139630891372880 released on /root/.cache/huggingface/transformers/98190fe7878f67d122ca4539eb6b459bc77dd757fe54e8cd774952c7f32bab79.8efdbd1ba9de17901e4252a40a48003335296da3f7584ca3cac46bff5d9d142b.lock
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-00923b8dba5a> in <module>()
      1 # set up VQGAN
----> 2 vqgan = VQModel.from_pretrained(VQGAN_REPO, revision=VQGAN_COMMIT_ID)

/usr/local/lib/python3.7/dist-packages/transformers/modeling_flax_utils.py in from_pretrained(cls, pretrained_model_name_or_path, dtype, *model_args, **kwargs)
    402                 else:
    403                     raise ValueError(
--> 404                         f"Trying to load the pretrained weight for {key} failed: checkpoint has shape "
    405                         f"{state[key].shape} which is incompatible with the model shape {random_state[key].shape}. "
    406                         "Using `ignore_mismatched_sizes=True` if you really want to load this checkpoint inside this "

ValueError: Trying to load the pretrained weight for ('decoder', 'mid', 'attn_1', 'norm', 'bias') failed: checkpoint has shape (1, 1, 1, 512) which is incompatible with the model shape (512,). Using `ignore_mismatched_sizes=True` if you really want to load this checkpoint inside this model.
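
The error above reports one parameter at a time, so it can help to list every mismatched key before deciding how to proceed. The following is a minimal sketch that works on plain dicts of shapes rather than on a real checkpoint. The helper name `find_shape_mismatches` and the second key, `("decoder", "conv_in", "bias")`, are hypothetical and used only for illustration. Only the first key and its two shapes come from the traceback.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {key: (checkpoint_shape, model_shape)} for keys whose shapes differ."""
    mismatches = {}
    for key, model_shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(key)
        if ckpt_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[key] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Key and shapes taken from the traceback above.
norm_bias = ("decoder", "mid", "attn_1", "norm", "bias")
# Hypothetical second key, added only so the example has a matching entry.
conv_bias = ("decoder", "conv_in", "bias")

checkpoint_shapes = {norm_bias: (1, 1, 1, 512), conv_bias: (512,)}
model_shapes = {norm_bias: (512,), conv_bias: (512,)}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for key, (ckpt_shape, model_shape) in mismatches.items():
    print("/".join(key), ckpt_shape, "vs", model_shape)
```

With a real model, the two dicts could be built from the flattened checkpoint and model parameters by mapping each array to its `.shape`.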

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

2 reactions
javismiles commented, Oct 30, 2021

Good day. We hit this same bug today. Everything was working well until your last update a few days ago. Execution can be kept from halting by going into the code and adding “ignore_mismatched_sizes=True” to the call, but that does not fix the problem: the run continues, yet the results produced by the model are terrible, washed out and with the wrong colors and contrast. Until your latest changes the model was working great and producing beautiful results. We definitely need a fix for this bug and hope you can provide one soon. Thank you very much 😃
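
The washed-out results would be consistent with mismatched parameters being skipped and left at their freshly initialised values instead of the trained ones. The toy NumPy sketch below illustrates that effect. It is not the transformers implementation, and both function names are hypothetical. It also shows one possible alternative, dropping size-1 axes so the trained values are kept. That alternative assumes the extra axes are purely a layout difference, which has not been verified for this checkpoint.

```python
import numpy as np


def load_ignoring_mismatches(checkpoint, init_params):
    """Keep the initial value whenever the checkpoint shape does not match."""
    return {
        key: checkpoint[key] if checkpoint[key].shape == init.shape else init
        for key, init in init_params.items()
    }


def load_squeezing_singletons(checkpoint, init_params):
    """Drop size-1 axes from checkpoint values when that makes the shapes match."""
    loaded = {}
    for key, init in init_params.items():
        value = checkpoint[key]
        if value.shape != init.shape:
            squeezed = np.squeeze(value)
            if squeezed.shape != init.shape:
                raise ValueError(f"cannot reconcile {value.shape} with {init.shape}")
            value = squeezed
        loaded[key] = value
    return loaded


key = ("decoder", "mid", "attn_1", "norm", "bias")
checkpoint = {key: np.full((1, 1, 1, 512), 0.25)}  # made-up "trained" values
init_params = {key: np.zeros((512,))}              # fresh initialisation

ignored = load_ignoring_mismatches(checkpoint, init_params)
squeezed = load_squeezing_singletons(checkpoint, init_params)

print(ignored[key][:3])   # trained values lost, initial zeros kept
print(squeezed[key][:3])  # trained values preserved
```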

1 reaction
borisdayma commented, Nov 30, 2021

This has been fixed. You can use dalle-mini/vqgan_imagenet_f16_16384, which is the updated checkpoint.

It’s different from the original we use in the inference notebook because that one was fine-tuned on other images. We will update the inference notebook to use this new checkpoint once we have a new model compatible with it.
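
As a sketch of what switching to the updated checkpoint might look like: the repository name comes from the comment above, while the `load_vqgan` wrapper is hypothetical and the import path assumes the vqgan-jax package is installed. No commit ID is pinned here because the comment does not give one.

```python
# Updated checkpoint named in the comment above.
VQGAN_REPO = "dalle-mini/vqgan_imagenet_f16_16384"


def load_vqgan(repo=VQGAN_REPO, revision=None):
    """Load the VQGAN from the Hugging Face Hub (downloads weights on first use)."""
    # Assumption: VQModel is provided by the vqgan-jax package.
    # The import is kept inside the function so this file can be read and
    # imported without that package installed.
    from vqgan_jax.modeling_flax_vqgan import VQModel

    return VQModel.from_pretrained(repo, revision=revision)
```

Calling `vqgan = load_vqgan()` would then replace the original `VQModel.from_pretrained(VQGAN_REPO, revision=VQGAN_COMMIT_ID)` line. Note the caveat above that the existing inference notebook was built around a checkpoint fine-tuned on other images.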


Top Results From Across the Web

Shapes mismatch triggered at modeling_flax_utils #14215
we are suddenly getting this error which was not happening before: ... failed: checkpoint has shape (1, 1, 1, 512) which is incompatible...

How to solve "Variable is available in checkpoint, but has an ...
Checkpoint shape : [[1, 1, 256, 512]], model variable shape: [[3, 3, 256, 512]]. This variable will not be initialized from the checkpoint....

Unsupported value type BatchEncoding - Hugging Face Forums
So far then, very much like the boiler-plate code in the course. My encoded training data looks like this:- {'input_ids': <tf.Tensor: shape=(1040, 512),...

model.fit ValueError: Shapes (None, 22) and (None, 10) are ...
ValueError: logits and labels must have the same shape ((None, 1) vs (None, 2)) ... ValueError: Shapes (None, 512) and (None, 512, 12)...

RuntimeError: Error(s) in loading state_dict for DataParallel
Hi, I have reimplemented the GAN for grayscale radiology data ... Size([1, 512, 1, 1]) from checkpoint, the shape in current model is...
