Model Weights Saved in 0.5.3 may be incompatible with 0.6.0

Describe the bug: Models trained and saved in MONAI 0.5.3 apparently cannot be restored and used in 0.6.0. There is a mismatch between the state_dict keys of several layers.

To Reproduce: Clone the UNETR repository (https://github.com/Project-MONAI/research-contributions/tree/master/UNETR/BTCV). Train a model in 0.5.3 and save the weights. Then load the model in 0.6.0 using:

        ckpt = torch.load(args.ckpt)
        model.load_state_dict(ckpt['state_dict'])

Expected behavior: MONAI 0.6.0 throws an error about mismatched keys when loading the state_dict for the convolution layers of the network. The following is a representative error:

RuntimeError: Error(s) in loading state_dict for UNETR:
        Unexpected key(s) in state_dict: "encoder1.layer.norm1.weight", "encoder1.layer.norm1.bias", "encoder1.layer.norm2.weight", "encoder1.layer.norm2.bias", "encoder1.layer.norm3.weight", "encoder1.layer.norm3.bias", "encoder10.layer.norm1.weight", "encoder10.layer.norm1.bias", "encoder10.layer.norm2.weight", "encoder10.layer.norm2.bias", "encoder10.layer.norm3.weight", "encoder10.layer.norm3.bias", "decoder5.conv_block.norm1.weight", "decoder5.conv_block.norm1.bias", "decoder5.conv_block.norm2.weight", "decoder5.conv_block.norm2.bias", "decoder5.conv_block.norm3.weight", "decoder5.conv_block.norm3.bias", "decoder4.conv_block.norm1.weight", "decoder4.conv_block.norm1.bias", "decoder4.conv_block.norm2.weight", "decoder4.conv_block.norm2.bias", "decoder4.conv_block.norm3.weight", "decoder4.conv_block.norm3.bias", "decoder3.conv_block.norm1.weight", "decoder3.conv_block.norm1.bias", "decoder3.conv_block.norm2.weight", "decoder3.conv_block.norm2.bias", "decoder3.conv_block.norm3.weight", "decoder3.conv_block.norm3.bias", "decoder2.conv_block.norm1.weight", "decoder2.conv_block.norm1.bias", "decoder2.conv_block.norm2.weight", "decoder2.conv_block.norm2.bias", "decoder2.conv_block.norm3.weight", "decoder2.conv_block.norm3.bias", "decoder1.conv_block.norm1.weight", "decoder1.conv_block.norm1.bias", "decoder1.conv_block.norm2.weight", "decoder1.conv_block.norm2.bias", "decoder1.conv_block.norm3.weight", "decoder1.conv_block.norm3.bias".
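The unexpected keys in an error like the one above can be listed before calling load_state_dict by comparing the two key sets. The following is a minimal sketch, not part of the original report: the helper name diff_state_dict_keys and the conv key name are hypothetical, and plain lists of strings stand in for ckpt['state_dict'].keys() and model.state_dict().keys().

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    checkpoint_keys: Iterable[str], model_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (unexpected, missing) key lists, each sorted.

    unexpected: present in the checkpoint but not in the model.
    missing: present in the model but not in the checkpoint.
    """
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return sorted(ckpt - model), sorted(model - ckpt)


# Toy key names modelled on the error message. With real objects these
# would come from ckpt["state_dict"].keys() and model.state_dict().keys().
saved_in_053 = [
    "decoder1.conv_block.conv1.conv.weight",  # hypothetical conv key
    "decoder1.conv_block.norm1.weight",
    "decoder1.conv_block.norm1.bias",
]
built_in_060 = ["decoder1.conv_block.conv1.conv.weight"]

unexpected, missing = diff_state_dict_keys(saved_in_053, built_in_060)
print("unexpected:", unexpected)
print("missing:", missing)
```

An empty missing list together with a non-empty unexpected list matches the error shown here: the checkpoint carries parameters that the newer model no longer defines.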

These keys belong to the convolutional layers of the network. In 0.5.3, you can inspect, for example, model.decoder1.conv_block.norm3.bias and get its values (even without loading the state_dict):

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       requires_grad=True)

But in 0.6.0, model.decoder1.conv_block.norm3.bias returns None.

If this is not fixed, it breaks the compatibility between 0.5.3 and 0.6.0 for restoring models.

Top GitHub Comments

wyli commented, Aug 24, 2021

I think this ticket makes a good point. @yiheng-wang-nv, perhaps we can update the migration guide to make this clear? As another workaround, is it possible to use copy_model_state to load v0.5.3 checkpoints? https://github.com/Project-MONAI/MONAI/blob/e95500ad1802e91da12f31350ba8e3b85dc36d11/monai/networks/utils.py#L339
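The general idea behind such a workaround, copying only the entries that the new model still defines, can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation or signature of copy_model_state: the helper keep_matching_keys and the conv key name are hypothetical, and lists of floats stand in for tensors.

```python
from typing import Any, Dict, Iterable, List, Tuple


def keep_matching_keys(
    checkpoint_state: Dict[str, Any], model_keys: Iterable[str]
) -> Tuple[Dict[str, Any], List[str]]:
    """Split a checkpoint into entries the model accepts and dropped keys."""
    allowed = set(model_keys)
    kept = {k: v for k, v in checkpoint_state.items() if k in allowed}
    dropped = sorted(k for k in checkpoint_state if k not in allowed)
    return kept, dropped


# Stand-in values; in practice these would be tensors from ckpt["state_dict"].
old_state = {
    "encoder1.layer.conv1.conv.weight": [0.1, 0.2],  # hypothetical conv key
    "encoder1.layer.norm1.weight": [1.0, 1.0],
    "encoder1.layer.norm1.bias": [0.0, 0.0],
}
new_model_keys = ["encoder1.layer.conv1.conv.weight"]

kept, dropped = keep_matching_keys(old_state, new_model_keys)
print("kept:", list(kept))
print("dropped:", dropped)
# With a real model, the filtered dict would then be passed on:
# model.load_state_dict(kept)
```

Dropping keys makes the load succeed, but any learned values in the dropped norm weights and biases are lost, so the restored model may not reproduce the original outputs unless those parameters were still at their initial values.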

In general, MONAI follows semantic versioning. At this stage we are at major version zero (0.y.z), which indicates initial development: anything may change at any time, and the public API should not be considered stable.
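One way to act on this when restoring checkpoints is to record the library version at save time and compare it at load time. The sketch below is an assumption-laden illustration, not a MONAI feature: the function names are hypothetical, versions are assumed to be plain MAJOR.MINOR.PATCH strings, and treating a minor bump under 0.y.z as potentially breaking is a conservative convention rather than a rule from the semantic versioning specification.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into integers."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch


def may_be_incompatible(saved_with: str, loading_with: str) -> bool:
    """Flag version pairs that deserve a compatibility check."""
    saved = parse_version(saved_with)
    current = parse_version(loading_with)
    if saved[0] != current[0]:
        # A different major version signals breaking changes.
        return True
    if saved[0] == 0:
        # Under 0.y.z nothing is guaranteed; flag any minor change.
        return saved[1] != current[1]
    # Same non-zero major version: assumed compatible in this sketch.
    return False


print(may_be_incompatible("0.5.3", "0.6.0"))
print(may_be_incompatible("0.6.0", "0.6.1"))
```

The saved version string could be stored next to the weights, for example under an extra key in the checkpoint dictionary, so that a mismatch produces a clear warning instead of a key error deep inside load_state_dict.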

yiheng-wang-nv commented, Aug 26, 2021