Feature extraction in torchvision.models.vit_b_16
🐛 Describe the bug
Hi
It’s easy enough to obtain output features from the CNNs in torchvision.models by doing this:
import torch
import torch.nn as nn
import torchvision.models as models
model = models.resnet18()
# Drop the final fc layer; the remaining children run in sequence up to avgpool
feature_extractor = nn.Sequential(*list(model.children())[:-1])
output_features = feature_extractor(torch.randn(1, 3, 224, 224))  # shape: (1, 512, 1, 1)
However, when I attempt to do this with torchvision.models.vit_b_16:
import torch
import torch.nn as nn
import torchvision.models as models
model = models.vit_b_16()
feature_extractor = nn.Sequential(*list(model.children())[:-1])
output_features = feature_extractor(torch.randn(1, 3, 224, 224))
I get the following error:
AssertionError: Expected (batch_size, seq_length, hidden_dim) got torch.Size([1, 768, 14, 14])
Any help would be greatly appreciated.
Versions
Torch version: 1.11.0+cu102
Torchvision version: 0.12.0+cu102
cc @datumbox
@datumbox before working on FX I tried structuring things that way in some timm models and found myself making weird modules that only made sense in a particular context. For instance, in the ViT example you would end up wrapping the final slice operation in a module… I also tried absorbing things into modules (like absorbing the slice operation into the encoder module). Sometimes this works; sometimes you find yourself changing the module name and docstring to make the semantics fit the module’s new inputs/outputs. A lot of the time, the detour just doesn’t feel worth it.
If the FX solution didn’t exist, I’d probably agree that that’s the way to go. But with FX, problem solved, no?
@DavidTorpey thanks a lot for reporting!
@alexander-soare I wonder if you have the bandwidth to have a look?