Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might look while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Feature extraction in torchvision.models.vit_b_16

See original GitHub issue

🐛 Describe the bug

Hi

It’s easy enough to obtain output features from the CNNs in torchvision.models by doing this:

import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18()
# Drop the final fc layer; everything up to (and including) the global
# average pool is kept, so this yields a pooled feature map.
feature_extractor = nn.Sequential(*list(model.children())[:-1])
output_features = feature_extractor(torch.randn(1, 3, 224, 224))

However, when I attempt to do this with torchvision.models.vit_b_16:

import torch
import torch.nn as nn
import torchvision.models as models

model = models.vit_b_16()
# Same trick: drop the classification head, keep conv_proj and the encoder.
feature_extractor = nn.Sequential(*list(model.children())[:-1])
output_features = feature_extractor(torch.randn(1, 3, 224, 224))

I get the following error:

AssertionError: Expected (batch_size, seq_length, hidden_dim) got torch.Size([1, 768, 14, 14])

Any help would be greatly appreciated.
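For context, the assertion is raised inside the Transformer encoder: the nn.Sequential built from model.children() passes the raw conv_proj output, a 1 x 768 x 14 x 14 patch grid, straight to the encoder, whereas VisionTransformer.forward first flattens the patches into a token sequence and prepends the class token. A rough sketch of those skipped steps, assuming the attribute names used by torchvision 0.12's VisionTransformer (conv_proj, class_token, encoder):

import torch
import torchvision.models as models

model = models.vit_b_16()
model.eval()

x = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    # Patchify: (1, 3, 224, 224) -> (1, 768, 14, 14); this is the tensor the
    # nn.Sequential approach hands directly to the encoder.
    feats = model.conv_proj(x)

    # Flatten the 14x14 patch grid into a token sequence (1, 196, 768),
    # then prepend the learnable class token -> (1, 197, 768).
    feats = feats.flatten(2).transpose(1, 2)
    class_token = model.class_token.expand(x.shape[0], -1, -1)
    feats = torch.cat([class_token, feats], dim=1)

    # Run the encoder and keep the class-token embedding as the feature vector.
    tokens = model.encoder(feats)
    output_features = tokens[:, 0]  # shape (1, 768)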

Versions

Torch version: 1.11.0+cu102
Torchvision version: 0.12.0+cu102

cc @datumbox

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

1 reaction
alexander-soare commented, Apr 1, 2022

@datumbox before working on FX I was trying to structure things that way in some timm models and found myself making weird modules that only made sense in a particular context. For instance, in the ViT example you would end up wrapping the final slice operation in a module… I also tried the approach of absorbing things into modules (like absorbing the slice operation into the encoder module). Sometimes this works; sometimes you find yourself changing the module name and docstring to try to make the semantics appropriate to the new inputs/outputs of the module. A lot of the time, the detour just doesn’t feel worth it.
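Concretely, the kind of wrapper being described might look something like the sketch below, where SelectClassToken is a made-up name for a module around ViT's final x[:, 0] slice:

import torch
import torch.nn as nn

class SelectClassToken(nn.Module):
    # Hypothetical wrapper around the x[:, 0] slice at the end of ViT's
    # forward. It only does the right thing when its input is a
    # (batch, tokens, dim) sequence whose first token is the class token,
    # i.e. a module that only makes sense in one particular context.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x[:, 0]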

If the FX solution didn’t exist, I’d probably be in agreement that that’s the way to go. But with FX, problem solved, no?
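The FX solution referred to here is the torchvision.models.feature_extraction utility. A minimal sketch, assuming torchvision 0.12, where 'encoder.ln' (the encoder's final LayerNorm) is the node to tap; if the name differs in another release, the available node names can be listed with get_graph_node_names:

import torch
from torchvision.models import vit_b_16
from torchvision.models.feature_extraction import (
    create_feature_extractor,
    get_graph_node_names,
)

model = vit_b_16()

# List the traceable node names; useful if 'encoder.ln' is named differently
# in your torchvision version.
train_nodes, eval_nodes = get_graph_node_names(model)
print(eval_nodes[-5:])

# Tap the output of the encoder's final LayerNorm: a (1, 197, 768) token
# sequence for a 224x224 input.
feature_extractor = create_feature_extractor(model, return_nodes={"encoder.ln": "tokens"})

out = feature_extractor(torch.randn(1, 3, 224, 224))
tokens = out["tokens"]          # (1, 197, 768)
output_features = tokens[:, 0]  # (1, 768) class-token embedding

Keeping only the first token matches what the classification head consumes; keeping the full token sequence gives per-patch features instead.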

1 reaction
datumbox commented, Apr 1, 2022

@DavidTorpey thanks a lot for reporting!

@alexander-soare I wonder if you have the bandwidth to have a look?

Read more comments on GitHub >

Top Results From Across the Web

  • Feature extraction for model inspection - PyTorch
    The torchvision.models.feature_extraction package contains feature extraction utilities that let us tap into our models to access intermediate ...
  • How to Extract Features - VISSL 0.1.6 documentation
    Given a pre-trained model, VISSL makes it easy to extract the features for the model on the datasets. VISSL seamlessly supports TorchVision models...
  • Vision Transformer (ViT) - Hugging Face
    BEiT models outperform supervised pre-trained vision transformers using a self-supervised method inspired by BERT (masked image modeling) and based on a VQ-VAE.
  • Feature Extraction - Pytorch Image Models - GitHub Pages
    All of the models in timm have consistent mechanisms for obtaining various types of features from the model for tasks besides ...
  • Feature Extraction With TorchVision's Newest Utility - YouTube
    In this video I walk you through how to use Torchvision's new feature extraction utility. Questions welcome in the comments!
