Model Weights Saved in 0.5.3 may be incompatible with 0.6.0
Describe the bug
Models trained and saved in MONAI 0.5.3 cannot be restored and used in 0.6.0. The keys in the saved state_dict do not match the keys of several layers in the 0.6.0 network.
To Reproduce
Clone the UNETR repository from https://github.com/Project-MONAI/research-contributions/tree/master/UNETR/BTCV. Train a model in 0.5.3 and save the weights. Then load the model in 0.6.0 using:
    ckpt = torch.load(args.ckpt)
    model.load_state_dict(ckpt['state_dict'])
Expected behavior
MONAI 0.6.0 throws an error about mismatched keys when loading the state_dict for the convolution layers of the network. The following is a representative error:
RuntimeError: Error(s) in loading state_dict for UNETR:
Unexpected key(s) in state_dict: "encoder1.layer.norm1.weight", "encoder1.layer.norm1.bias", "encoder1.layer.norm2.weight", "encoder1.layer.norm2.bias", "encoder1.layer.norm3.weight", "encoder1.layer.norm3.bias", "encoder10.layer.norm1.weight", "encoder10.layer.norm1.bias", "encoder10.layer.norm2.weight", "encoder10.layer.norm2.bias", "encoder10.layer.norm3.weight", "encoder10.layer.norm3.bias", "decoder5.conv_block.norm1.weight", "decoder5.conv_block.norm1.bias", "decoder5.conv_block.norm2.weight", "decoder5.conv_block.norm2.bias", "decoder5.conv_block.norm3.weight", "decoder5.conv_block.norm3.bias", "decoder4.conv_block.norm1.weight", "decoder4.conv_block.norm1.bias", "decoder4.conv_block.norm2.weight", "decoder4.conv_block.norm2.bias", "decoder4.conv_block.norm3.weight", "decoder4.conv_block.norm3.bias", "decoder3.conv_block.norm1.weight", "decoder3.conv_block.norm1.bias", "decoder3.conv_block.norm2.weight", "decoder3.conv_block.norm2.bias", "decoder3.conv_block.norm3.weight", "decoder3.conv_block.norm3.bias", "decoder2.conv_block.norm1.weight", "decoder2.conv_block.norm1.bias", "decoder2.conv_block.norm2.weight", "decoder2.conv_block.norm2.bias", "decoder2.conv_block.norm3.weight", "decoder2.conv_block.norm3.bias", "decoder1.conv_block.norm1.weight", "decoder1.conv_block.norm1.bias", "decoder1.conv_block.norm2.weight", "decoder1.conv_block.norm2.bias", "decoder1.conv_block.norm3.weight", "decoder1.conv_block.norm3.bias".
These keys are for the convolutional layers of the network. In version 0.5.3, you can inspect, for example, model.decoder1.conv_block.norm3.bias and get its values (even without loading the state_dict):
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
requires_grad=True)
But in 0.6.0, model.decoder1.conv_block.norm3.bias returns None.
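The same symptom can be reproduced in plain PyTorch, which may help narrow down the cause. The sketch below assumes (this is not confirmed in the issue) that the norm layers are instance norms created with learnable affine parameters in one version and without them in the other. `Block`, `old_block` and `new_block` are hypothetical names, not MONAI classes.

```python
import torch
from torch import nn


class Block(nn.Module):
    """Toy stand-in for a conv block; NOT the MONAI implementation."""

    def __init__(self, channels: int, affine: bool):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        # affine=True registers weight/bias parameters; affine=False leaves them as None
        self.norm1 = nn.InstanceNorm3d(channels, affine=affine)


old_block = Block(4, affine=True)   # assumed to resemble the 0.5.3 behaviour
new_block = Block(4, affine=False)  # assumed to resemble the 0.6.0 behaviour

print(old_block.norm1.bias)  # a Parameter of zeros
print(new_block.norm1.bias)  # None

# Strict loading fails because the checkpoint carries norm1.weight / norm1.bias
try:
    new_block.load_state_dict(old_block.state_dict())
    strict_error = None
except RuntimeError as err:
    strict_error = str(err)
print(strict_error)
```

If the real networks behave like this toy block, the extra `norm*.weight` and `norm*.bias` entries in the checkpoint have no counterpart in the newer model, which would explain both the `None` attribute and the "Unexpected key(s)" message.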
If this is not fixed, it breaks the compatibility between 0.5.3 and 0.6.0 for restoring models.
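Before restoring a checkpoint across versions, it can help to compare the key names directly instead of reading them out of the exception text. The helper below is a minimal pure-Python sketch; `diff_keys` is a hypothetical name, and the `conv1` key in the example is made up for illustration (only the `norm1` keys appear in the error above).

```python
def diff_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) key lists, using PyTorch's terminology."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    missing = sorted(model - ckpt)     # the model expects these, the checkpoint lacks them
    unexpected = sorted(ckpt - model)  # the checkpoint has these, the model does not
    return missing, unexpected


# Illustrative key names; in practice use ckpt['state_dict'].keys()
# and model.state_dict().keys()
ckpt_keys = [
    "decoder1.conv_block.conv1.weight",  # hypothetical key
    "decoder1.conv_block.norm1.weight",
    "decoder1.conv_block.norm1.bias",
]
model_keys = ["decoder1.conv_block.conv1.weight"]

missing, unexpected = diff_keys(ckpt_keys, model_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```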
Top GitHub Comments
I think this ticket makes a good point. @yiheng-wang-nv, perhaps we can update the migration guide to make it clear? As another workaround, is it possible to use copy_model_state to load v0.5.3 checkpoints? https://github.com/Project-MONAI/MONAI/blob/e95500ad1802e91da12f31350ba8e3b85dc36d11/monai/networks/utils.py#L339

In general, MONAI adopts semantic versioning. At this stage we are at major version zero (0.y.z), indicating initial development. Anything may change at any time, and the public API should not be considered stable.

Thanks @wyli, I've updated the guide.
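The copy_model_state workaround suggested above amounts to copying only the entries that the target model can accept. The sketch below is not that MONAI function; it is a plain PyTorch approximation of the idea, with `load_matching` and the two toy networks as hypothetical names. Entries that are skipped are simply lost, so a model restored this way may not reproduce the original predictions if the dropped parameters had been trained away from their defaults.

```python
import torch
from torch import nn


def load_matching(model: nn.Module, state_dict: dict):
    """Load only entries whose name and shape match the model; report the rest."""
    own = model.state_dict()
    kept = {
        name: value
        for name, value in state_dict.items()
        if name in own and own[name].shape == value.shape
    }
    skipped = sorted(set(state_dict) - set(kept))
    # strict=False tolerates model keys that the filtered checkpoint lacks
    result = model.load_state_dict(kept, strict=False)
    return skipped, list(result.missing_keys)


# Toy networks standing in for the two library versions (assumption)
old_net = nn.Sequential(nn.Conv3d(1, 2, 1), nn.InstanceNorm3d(2, affine=True))
new_net = nn.Sequential(nn.Conv3d(1, 2, 1), nn.InstanceNorm3d(2, affine=False))

skipped, missing = load_matching(new_net, old_net.state_dict())
print("skipped checkpoint keys:", skipped)
print("model keys left at init values:", missing)
```

With a real checkpoint, the call would look like `load_matching(model, ckpt['state_dict'])`, and the returned lists show what was left out.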