Question for implementation of resize in image-classification examples.
System Info
- transformers version: 4.21.0.dev0
- Platform: Linux-4.15.0-175-generic-x86_64-with-glibc2.29
- Python version: 3.8.10
- Huggingface_hub version: 0.8.1
- PyTorch version (GPU?): 1.12.0+cu113 (True)
- Tensorflow version (GPU?): 2.9.1 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
Who can help?
Examples:
maintained examples (not research project or legacy): @sgugger, @patil-suraj
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
- minimal code for reproduction.
## partial import from image classification scripts
from typing import Optional
from dataclasses import dataclass, field
from torchvision.transforms import (
    CenterCrop,
    Compose,
    Resize,
)
from transformers import (
    MODEL_FOR_IMAGE_CLASSIFICATION_MAPPING,
    AutoConfig,
    AutoFeatureExtractor,
)
MODEL_CONFIG_CLASSES = list(MODEL_FOR_IMAGE_CLASSIFICATION_MAPPING.keys())
MODEL_TYPES = tuple(conf.model_type for conf in MODEL_CONFIG_CLASSES)
@dataclass
class ModelArguments:
    """
    Arguments pertaining to which model/config/tokenizer we are going to fine-tune from.
    """
    model_name_or_path: str = field(
        default="google/vit-base-patch16-224-in21k",
        metadata={"help": "Path to pretrained model or model identifier from huggingface.co/models"},
    )
    model_type: Optional[str] = field(
        default=None,
        metadata={"help": "If training from scratch, pass a model type from the list: " + ", ".join(MODEL_TYPES)},
    )
    config_name: Optional[str] = field(
        default=None, metadata={"help": "Pretrained config name or path if not the same as model_name"}
    )
    cache_dir: Optional[str] = field(
        default=None, metadata={"help": "Where do you want to store the pretrained models downloaded from s3"}
    )
    model_revision: str = field(
        default="main",
        metadata={"help": "The specific model version to use (can be a branch name, tag name or commit id)."},
    )
    feature_extractor_name: str = field(default=None, metadata={"help": "Name or path of preprocessor config."})
    use_auth_token: bool = field(
        default=False,
        metadata={
            "help": (
                "Will use the token generated when running `transformers-cli login` (necessary to use this script "
                "with private models)."
            )
        },
    )
    ignore_mismatched_sizes: bool = field(
        default=False,
        metadata={"help": "Will enable to load a pretrained model whose head dimensions are different."},
    )
# use default model_args
model_args = ModelArguments()
feature_extractor = AutoFeatureExtractor.from_pretrained(
    model_args.feature_extractor_name or model_args.model_name_or_path,
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
    use_auth_token=True if model_args.use_auth_token else None,
)
# ToTensor and normalize are commented out so the result stays a PIL image that can be inspected.
_val_transforms = Compose(
    [
        Resize(feature_extractor.size),
        CenterCrop(feature_extractor.size)
        # ToTensor(),
        # normalize,
    ]
)
- get sample image
from datasets import load_dataset
ds = load_dataset('imagenet-1k', use_auth_token=True, streaming=True)
im = list(ds['train'].take(1))[0]['image']
- original transform
original_transform = _val_transforms(im)
original_transform
- new transform
_val_transforms_new = Compose(
    [
        Resize((feature_extractor.size, feature_extractor.size)),
        CenterCrop(feature_extractor.size)
        # ToTensor(),
        # normalize,
    ]
)
new_transform = _val_transforms_new(im)
new_transform
Expected behavior
I’m hesitant to say this because I’m new to computer vision, but the resize transformation in _val_transforms seems to be wrong in the image classification example scripts (here and here).
With a non-square input, this transform may cut off part of the object during the validation step.
...
_val_transforms = Compose(
    [
        Resize(feature_extractor.size),
        CenterCrop(feature_extractor.size),
        ToTensor(),
        normalize,
    ]
)
...
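To make the concern concrete, here is a rough sketch of the size arithmetic only. It assumes torchvision's documented behaviour that Resize with a single int matches the shorter edge to that value and keeps the aspect ratio. The helper names, the 500x375 example image and the value 224 for feature_extractor.size are assumptions for illustration, not taken from the example script.

```python
def resize_shorter_edge(width, height, size):
    """Output (width, height) when the shorter edge is scaled to `size`
    and the longer edge follows proportionally (truncated to an int)."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_trim(width, height, crop):
    """Pixels removed from each side (horizontal, vertical) by a centred
    crop x crop window. Assumes the image is at least crop x crop."""
    return (width - crop) // 2, (height - crop) // 2


size = 224  # assumed value of feature_extractor.size
resized = resize_shorter_edge(500, 375, size)  # hypothetical landscape image
trimmed = center_crop_trim(*resized, size)
print("after Resize:", resized)
print("trimmed per side by CenterCrop:", trimmed)
```

For a landscape image, everything outside the central square is dropped on the left and right, so an object near the edge can indeed be cut.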
To keep the whole object in the frame and only change the size of the image, I think the following code is right for _val_transforms.
...
_val_transforms = Compose(
    [
        Resize((feature_extractor.size, feature_extractor.size)),
        CenterCrop(feature_extractor.size),
        ToTensor(),
        normalize,
    ]
)
...
If I’ve misunderstood, please feel free to tell me about it.
Issue Analytics
- State:
- Created a year ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Hi @amyeroberts, thanks for the explanation.
Now I understand what you intended.
I’m closing this issue, as it has been resolved.
Hi @DataLama, thanks for raising the issue.
In this script, the reason for the validation transformations being defined like this and in this order - resize then centre crop - is that we end up with an image of size (feature_extractor.size, feature_extractor.size), but what’s shown in the image has the same aspect ratio as the original, i.e. the image isn’t “squashed”.
In your suggestion, the image would be resized to (feature_extractor.size, feature_extractor.size) first, changing the aspect ratio, and CenterCrop(feature_extractor.size) would then not have an effect.
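The trade-off between the two pipelines can be sketched with plain arithmetic. This is only an illustration under stated assumptions: Resize(size) is modelled as "shorter edge becomes size, ratio kept" and Resize((size, size)) as "both edges forced to size". The function name describe_pipeline, the 500x375 image and the size 224 are hypothetical.

```python
def describe_pipeline(width, height, size, square_resize):
    """Summarise Resize followed by CenterCrop(size) for one image size.

    square_resize=False models Resize(size): shorter edge -> size, ratio kept.
    square_resize=True models Resize((size, size)): both edges forced to size.
    """
    if square_resize:
        resized = (size, size)
    elif width <= height:
        resized = (size, int(size * height / width))
    else:
        resized = (int(size * width / height), size)

    scale_x = resized[0] / width
    scale_y = resized[1] / height
    return {
        "resized": resized,
        # Close to 1.0 means the content keeps its proportions;
        # anything else means the image is squashed or stretched.
        "distortion": scale_x / scale_y,
        # A centred size x size crop of a size x size image removes nothing.
        "crop_is_noop": resized == (size, size),
    }


script = describe_pipeline(500, 375, 224, square_resize=False)
suggested = describe_pipeline(500, 375, 224, square_resize=True)
print("script pipeline:   ", script)
print("suggested pipeline:", suggested)
```

In this sketch the script's pipeline keeps the proportions but crops the sides, while the suggested pipeline keeps every pixel's content but distorts it and makes the crop redundant. For a square input the two coincide.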