Shape mismatch triggered at modeling_flax_utils
Good day, while using the MiniDalle repo at https://github.com/borisdayma/dalle-mini/issues/99,
we are suddenly getting an error that was not happening before:
```
Trying to load the pretrained weight for ('decoder', 'mid', 'attn_1', 'norm', 'bias') failed: checkpoint has shape (1, 1, 1, 512) which is incompatible with the model shape (512,). Using ignore_mismatched_sizes=True if you really want to load this checkpoint inside this model.
```
This is being triggered here: https://huggingface.co/transformers/_modules/transformers/modeling_flax_utils.html
in this area:
```python
# Mistmatched keys contains tuples key/shape1/shape2 of weights in the checkpoint that have a shape not
# matching the weights in the model.
mismatched_keys = []
for key in state.keys():
    if key in random_state and state[key].shape != random_state[key].shape:
        if ignore_mismatched_sizes:
            mismatched_keys.append((key, state[key].shape, random_state[key].shape))
            state[key] = random_state[key]
        else:
            raise ValueError(
                f"Trying to load the pretrained weight for {key} failed: checkpoint has shape "
                f"{state[key].shape} which is incompatible with the model shape {random_state[key].shape}. "
                "Using ignore_mismatched_sizes=True if you really want to load this checkpoint inside this "
                "model."
            )
```
There is a way to avoid halting the execution by going into the code and adding `ignore_mismatched_sizes=True` to the call. However, this does not fix the problem: execution continues, but the mismatched parameters are replaced by the model's randomly initialized values (the `state[key] = random_state[key]` line above), and the results produced by the MiniDalle model are wrong, all washed out and with the wrong colors and contrast. This was not happening some days ago, so something has changed that is producing the problem, and it seems to be a bug coming from this file. Any tips are super welcome, thank you 😃
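For reference, the flag can also be passed straight through `from_pretrained` rather than editing the library. A minimal sketch of that workaround (the model class and checkpoint id below are placeholders, not the exact ones dalle-mini uses):

```python
from transformers import FlaxAutoModel  # placeholder class; dalle-mini actually loads a VQGAN-style model

# Passing the flag forwards it to the loading code quoted above: the ValueError
# is skipped, but every mismatched weight is silently re-initialized at random,
# which is consistent with the degraded, washed-out generations described above.
model = FlaxAutoModel.from_pretrained(
    "some-org/some-checkpoint",  # placeholder checkpoint id
    ignore_mismatched_sizes=True,
)
```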
Issue Analytics
- State: closed
- Created: 2 years ago
- Reactions: 9
- Comments: 5 (4 by maintainers)
It seems to work with `flax==0.3.5`. My guess is that weights are now being squeezed; maybe we need to re-upload a new checkpoint? Actually, here a shape of (512,) seems to make more sense than (1, 1, 1, 512).

Left a comment here: https://github.com/borisdayma/dalle-mini/issues/99#issuecomment-963103973

Closing this issue, since it's not related to `transformers`.
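To illustrate the squeeze hypothesis above, here is a sketch of the shape change only (not the actual flax loading code): a norm bias stored as (1, 1, 1, 512) in the old checkpoint collapses to the flat (512,) vector that the model expects once the singleton axes are removed.

```python
import jax.numpy as jnp

# Old checkpoints store the norm bias with singleton leading axes...
checkpoint_bias = jnp.zeros((1, 1, 1, 512))

# ...while the model (with newer flax) declares it as a flat vector.
model_bias_shape = (512,)

# Squeezing removes the singleton axes, reconciling the two shapes.
assert jnp.squeeze(checkpoint_bias).shape == model_bias_shape
```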