Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Shapes mismatch triggered at modeling_flax_utils

See original GitHub issue

Good day. While using the MiniDalle repo (see https://github.com/borisdayma/dalle-mini/issues/99), we are suddenly getting this error, which was not happening before:

"Trying to load the pretrained weight for ('decoder', 'mid', 'attn_1', 'norm', 'bias') failed: checkpoint has shape (1, 1, 1, 512) which is incompatible with the model shape (512,). Using ignore_mismatched_sizes=True if you really want to load this checkpoint inside this model."

This is being triggered here: https://huggingface.co/transformers/_modules/transformers/modeling_flax_utils.html

in this area:

    # Mistmatched keys contains tuples key/shape1/shape2 of weights in the checkpoint that have a shape not
    # matching the weights in the model.
    mismatched_keys = []
    for key in state.keys():
        if key in random_state and state[key].shape != random_state[key].shape:
            if ignore_mismatched_sizes:
                mismatched_keys.append((key, state[key].shape, random_state[key].shape))
                state[key] = random_state[key]
            else:
                raise ValueError(
                    f"Trying to load the pretrained weight for {key} failed: checkpoint has shape "
                    f"{state[key].shape} which is incompatible with the model shape {random_state[key].shape}. "
                    "Using ignore_mismatched_sizes=True if you really want to load this checkpoint inside this "
                    "model."
                )

There is a way to avoid halting the execution by going into the code and passing ignore_mismatched_sizes=True in the call. However, this does not fix the problem: as the snippet above shows, a mismatched weight is then simply replaced with the model's freshly initialized value, so execution continues but the results produced by the minidalle model are wrong, all washed out and with the wrong colors and contrast (this was not happening a few days ago, so something has changed that is causing the problem). So this seems to be a bug coming from this file. Any tips are very welcome, thank you 😃
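For reference, here is a minimal sketch of what passing that flag looks like when a Flax checkpoint is loaded through from_pretrained. The model class and checkpoint id below are placeholders, not necessarily the exact ones dalle-mini uses:

    from transformers import FlaxAutoModel

    # Placeholder checkpoint id; substitute the checkpoint that dalle-mini actually loads.
    model = FlaxAutoModel.from_pretrained(
        "some-org/some-flax-checkpoint",
        ignore_mismatched_sizes=True,  # suppresses the ValueError, but each mismatched
                                       # weight falls back to its random initialization
    )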

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 9
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
borisdayma commented, Nov 3, 2021

It seems to work with flax==0.3.5. My guess is that the weights are now being squeezed. Maybe we need to re-upload a new checkpoint? Actually, a shape of (512,) seems to make more sense here than (1, 1, 1, 512).
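A small illustration of the squeezing mentioned above, using NumPy purely for demonstration: removing the singleton dimensions of a (1, 1, 1, 512) tensor leaves exactly the (512,) shape from the error message.

    import numpy as np

    # A (1, 1, 1, 512) tensor, like the checkpoint weight in the error message.
    bias = np.zeros((1, 1, 1, 512))

    # Squeezing drops the size-1 dimensions, leaving the (512,) shape the model expects.
    print(bias.squeeze().shape)  # (512,)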

0 reactions
patil-suraj commented, Nov 8, 2021

Left a comment here: https://github.com/borisdayma/dalle-mini/issues/99#issuecomment-963103973
Closing this issue, since it's not related to transformers.

Read more comments on GitHub >

Top Results From Across the Web

shape mismatch: indexing arrays could not be broadcast ...
The reason I use slice() is because the actual index vector for table is code generated, so I can't use : , unfortunately...
Read more >
Stacked barchart, bottom parameter triggers Error: Shape ...
python - Stacked barchart, bottom parameter triggers Error: Shape mismatch: objects cannot be broadcast to a single shape - Data Science Stack ...
Read more >
Shape mismatch using theano shared variable - Questions
Hello, This is the first time I've tried to use theano shared variables in a model. I'm getting the following error and I'm...
Read more >
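The results above all revolve around the same kind of broadcasting failure. As a generic, purely illustrative sketch (unrelated to the specific code in those questions), NumPy raises a shape-mismatch error whenever two arrays cannot be aligned:

    import numpy as np

    a = np.zeros((3,))
    b = np.zeros((2,))

    try:
        a + b  # shapes (3,) and (2,) cannot be broadcast together
    except ValueError as err:
        print(err)  # operands could not be broadcast together with shapes (3,) (2,)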
