
Fine-tuning with an existing model

See original GitHub issue

Hi,

I tried to train a model with a custom dataset and the resnet101 backbone. I noticed that while half of the bounding boxes looked accurate, the masks were completely off. I drew the annotations to double-check and verified that they are correct.

It could be due to the size of the dataset: 1357 images and 21 classes. I would like to use yolact_im700_54_80000.pth and fine-tune it with my custom classes to see if this improves my results. What would be the steps to do this?

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

9 reactions
dbolya commented, Sep 21, 2019

I don’t have explicit support for that, but you can probably get it to work by changing this line: https://github.com/dbolya/yolact/blob/a70b68dd70aac5a1f41789771a66fb33adba2809/yolact.py#L473

Replace that line with:

try:
    self.load_state_dict(state_dict)
except RuntimeError as e:
    print('Ignoring "' + str(e) + '"')

and then resume training from yolact_im700_54_80000.pth:

    python train.py --config=<your_config> --resume=weights/yolact_im700_54_80000.pth --start_iter=0

When there are size mismatches between tensors, PyTorch will print an error message but keep loading the rest of the tensors anyway. So here we just attempt to load a checkpoint with the wrong number of classes, swallow the errors that PyTorch raises, and then start training from iteration 0 with just those few class-dependent tensors left untrained. You should see only the C (class) and S (semantic segmentation) losses reset.
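The same idea can be done without swallowing a RuntimeError: filter the checkpoint down to the shape-compatible tensors and load with strict=False, which also reports which tensors were left untrained. This is a minimal sketch against plain PyTorch, not YOLACT's code; the Head module and its layer names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Head(nn.Module):
    """Toy stand-in for a detector: a shared layer plus a class-dependent layer."""
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Linear(8, 8)               # shape-compatible across class counts
        self.classifier = nn.Linear(8, num_classes)   # shape depends on num_classes

pretrained = Head(num_classes=81)   # e.g. a COCO checkpoint (80 classes + background)
model = Head(num_classes=22)        # custom dataset: 21 classes + background

# Keep only tensors whose name and shape match the new model.
ckpt = pretrained.state_dict()
target = model.state_dict()
compatible = {k: v for k, v in ckpt.items()
              if k in target and v.shape == target[k].shape}

# strict=False tolerates the missing class-dependent tensors and tells us which they are.
result = model.load_state_dict(compatible, strict=False)
print(sorted(result.missing_keys))
```

The tensors listed in `result.missing_keys` keep their fresh initialization, which is exactly what causes the C and S losses to reset when training resumes.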

You probably also want to modify the learning rate, decay schedule, and number of iterations in your config to account for fine-tuning.
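As a rough sketch of that adjustment: YOLACT's configs in data/config.py expose fields named lr, gamma, lr_steps, and max_iter (check your copy of the repo; these names are an assumption here, and the values below are placeholders, not tuned recommendations).

```python
# Hypothetical fine-tuning overrides: lower base rate, much shorter schedule.
finetune = dict(
    lr=1e-4,                  # ~10x lower than a from-scratch rate
    gamma=0.1,                # keep the step-decay factor
    lr_steps=(30000, 40000),  # decay far earlier than the original schedule
    max_iter=50000,           # far fewer iterations than the original 800k
)

def lr_at(it, base_lr, steps, gamma):
    """Step-decay schedule: multiply base_lr by gamma once per step boundary passed."""
    return base_lr * gamma ** sum(it >= s for s in steps)
```

For example, with the placeholder values above the rate stays at 1e-4 until iteration 30000, then drops by a factor of 10 at each listed step.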

3 reactions
dbolya commented, Feb 25, 2020

@hana9090

Those look reset to me. They should reset to a value similar to what they start out at; C starts above 10 and drops below 5, so it looks like it did reset.

And the loss key is in multibox_loss.py:

        # Loss Key:
        #  - B: Box Localization Loss
        #  - C: Class Confidence Loss
        #  - M: Mask Loss
        #  - P: Prototype Loss
        #  - D: Coefficient Diversity Loss
        #  - E: Class Existence Loss
        #  - S: Semantic Segmentation Loss 

I is now the “Mask IoU loss”.
