The `test_detection_model_trainable_backbone_layers` test shouldn't download the pretrained_backbone weights

Feature Improvement

The test_detection_model_trainable_backbone_layers test currently downloads the weights of the backbone: https://github.com/pytorch/vision/blob/6530546080148cd5d9533b50a8258f12cb050c82/test/test_models.py#L786

Setting pretrained_backbone=True is necessary because the number of trainable layers depends on this value. Unfortunately, downloading pre-trained weights can make tests slow and flaky, so it should be avoided. Until we set up a cache to store the weights locally on the CI, we should find a way to skip the actual download of the weights during test execution.
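
To illustrate why the flag matters, here is a simplified sketch of the kind of rule a detection model builder might apply. The function name `resolve_trainable_layers`, its parameters, and its exact behavior are assumptions made for illustration, not the torchvision implementation: without pretrained weights nothing is frozen, so the requested value is ignored.

```python
import warnings
from typing import Optional


def resolve_trainable_layers(
    pretrained: bool,
    requested: Optional[int],
    max_value: int,
    default_value: int,
) -> int:
    """Hypothetical helper: decide how many backbone layers stay trainable."""
    if not pretrained:
        # Freezing randomly initialised layers makes no sense, so every layer
        # stays trainable and the requested value is ignored.
        if requested is not None:
            warnings.warn("requested value ignored without pretrained weights")
        return max_value
    if requested is None:
        return default_value
    if not 0 <= requested <= max_value:
        raise ValueError(f"expected a value in [0, {max_value}], got {requested}")
    return requested
```

Under these assumptions, a request for 4 trainable layers is only honoured when `pretrained` is true, which is why a test of that parameter has to ask for a pretrained backbone.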

cc @datumbox @pmeier

Issue Analytics

Created 2 years ago
Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction

NicolasHug commented, Oct 20, 2021

I think it should be possible to patch load_state_dict at the nn.Module level as you suggested, something like:

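import pytest
import torchvision

# get_available_detection_models() is assumed to come from the surrounding test module.
@pytest.mark.parametrize("model_name", get_available_detection_models())
def test_mock_method(model_name, mocker):
    # Patch both calls so that no weights are downloaded or loaded.
    mocker.patch("torch.nn.Module.load_state_dict")
    mocker.patch("torch.hub.load_state_dict_from_url")

    model = torchvision.models.detection.__dict__[model_name](
        pretrained=False, pretrained_backbone=True, trainable_backbone_layers=4,
    )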

I haven’t checked, but this shouldn’t make any network calls. That can be verified with pytest-socket: https://github.com/miketheman/pytest-socket
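
A minimal stdlib-only sketch of that check, in the spirit of what pytest-socket does but not using its API: replace `socket.socket` with a class that raises for the duration of a block, so any attempted download fails loudly. `NetworkBlockedError` and `no_network` are hypothetical names introduced here.

```python
import contextlib
import socket
from unittest import mock


class NetworkBlockedError(RuntimeError):
    """Raised when code tries to open a socket inside no_network()."""


class _BlockedSocket:
    def __init__(self, *args, **kwargs):
        raise NetworkBlockedError("network access attempted during test")


@contextlib.contextmanager
def no_network():
    # Swap socket.socket for a class that refuses to be instantiated;
    # mock.patch.object restores the original on exit.
    with mock.patch.object(socket, "socket", _BlockedSocket):
        yield
```

Wrapping the model construction in `with no_network():` would make the test fail if the patches missed a download path.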

1 reaction

datumbox commented, Oct 20, 2021

@NicolasHug thanks for checking. Yes, the weights are always added in the manifold so this won’t be a problem.

The speed is not as bad on CircleCI. We checked during merge (see here) and only 1 of these tests appeared among the slowest, with an execution time of 14-15 secs. All the others executed in less than 8 secs:

============================= slowest 20 durations =============================
54.32s call     test/test_models.py::test_quantized_classification_model[mobilenet_v3_large]
39.12s call     test/test_models.py::test_quantized_classification_model[resnext101_32x8d]
38.79s call     test/test_models.py::test_quantized_classification_model[mobilenet_v2]
31.24s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x0_5]
18.17s call     test/test_models.py::test_quantized_classification_model[googlenet]
15.27s call     test/test_models.py::test_quantized_classification_model[resnet50]
14.98s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x1_0]
14.80s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x2_0]
14.48s call     test/test_models.py::test_detection_model_trainable_backbone_layers[ssd300_vgg16]
14.10s call     test/test_models.py::test_quantized_classification_model[shufflenet_v2_x1_5]
13.88s call     test/test_datasets.py::LFWPairsTestCase::test_transforms
13.29s call     test/test_models.py::test_detection_model[cpu-fasterrcnn_mobilenet_v3_large_fpn]
12.53s call     test/test_models.py::test_classification_model[cpu-densenet201]
12.14s call     test/test_models.py::test_classification_model[cpu-densenet161]
11.28s call     test/test_models.py::test_classification_model[cpu-efficientnet_b7]
10.74s call     test/test_models.py::test_classification_model[cpu-densenet169]
10.66s call     test/test_backbone_utils.py::TestFxFeatureExtraction::test_jit_forward_backward[efficientnet_b7]
10.14s call     test/test_models.py::test_classification_model[cpu-regnet_y_32gf]
9.10s call     test/test_models.py::test_classification_model[cpu-efficientnet_b6]
8.80s call     test/test_datasets.py::LFWPeopleTestCase::test_transforms
=========================== short test summary info ============================

The problem can be solved quite easily with the new weights API, because you can patch the model loading method of the weights during tests and avoid the actual download. That’s a bit harder with the old pretrained approach. Any thoughts or ideas about this? I’m open to reverting if a good solution doesn’t exist now; we could add the test back once we’ve moved the multi-pretrained model mechanism into main.

EDIT:

I checked the new proposed API and patching it won’t be as easy as I remembered. I think we need to make some additional minor adjustments. Here is how the weights are typically loaded based on the current proposal:

if weights is not None:
    model.load_state_dict(weights.state_dict(progress=progress))

One could easily patch the state_dict method of the weights in the tests, so that they don’t actually download anything. Unfortunately, passing None to load_state_dict() won’t work. We could have an extra step that checks that the weights are not None before loading them, but this might require some extra thought.
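
A minimal sketch of that idea, using stand-in classes rather than torchvision or the proposed API itself: `FakeWeights`, `FakeModel`, `build_model` and `build_without_download` are all hypothetical. The patch makes `state_dict` return `None` instead of fetching anything, and the builder skips loading in that case, which is the extra check mentioned above.

```python
from typing import Optional
from unittest import mock


class FakeWeights:
    """Stand-in for a weights entry; the real one would download a file."""

    def state_dict(self, progress: bool = True) -> dict:
        raise RuntimeError("would download weights here")


class FakeModel:
    """Stand-in for an nn.Module that records what was loaded."""

    def __init__(self) -> None:
        self.loaded: Optional[dict] = None

    def load_state_dict(self, state_dict: dict) -> None:
        if state_dict is None:
            raise TypeError("state_dict must not be None")
        self.loaded = state_dict


def build_model(weights: Optional[FakeWeights], progress: bool = True) -> FakeModel:
    model = FakeModel()
    if weights is not None:
        state_dict = weights.state_dict(progress=progress)
        # Extra guard: a patched state_dict() may return None in tests.
        if state_dict is not None:
            model.load_state_dict(state_dict)
    return model


def build_without_download() -> FakeModel:
    # Patch the method on the class so no instance can trigger a download.
    with mock.patch.object(FakeWeights, "state_dict", return_value=None) as patched:
        model = build_model(FakeWeights())
        patched.assert_called_once_with(progress=True)
    return model
```

Whether such a guard belongs in the builder or in the weights object is a design question this sketch does not settle; it only shows that the patched path never reaches `load_state_dict`.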
