Clarify transformations for image models at inference time
Hello!
I think it might be helpful for the docs to clarify the transformations that images go through, and maybe to provide a public method that encapsulates them. Here’s my current understanding.
1. Types
You can pass a few different types:
```ts
export type ClassifierInputSource = HTMLImageElement | HTMLCanvasElement | HTMLVideoElement | ImageBitmap;
```
2. cropTo
The image data is copied into a new canvas and cropped with cropTo. This resizes to 224x224 using a “cover”-like strategy: the image is scaled so it covers at least 224x224, then cropped from the center.
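For concreteness, here’s a minimal sketch of what a “cover”-style center crop to 224x224 could look like. This is my reconstruction, not the library’s actual cropTo, and the coverCrop name is illustrative:

```ts
// Sketch of a "cover"-style center crop (assumed behavior, illustrative name).
// Scales the image so its shorter side is exactly 224px, then crops the
// longer side's overflow equally from both ends.
function coverCrop(img: HTMLImageElement, size = 224): HTMLCanvasElement {
  const canvas = document.createElement('canvas');
  canvas.width = size;
  canvas.height = size;
  const ctx = canvas.getContext('2d')!;
  const scale = size / Math.min(img.naturalWidth, img.naturalHeight);
  const w = img.naturalWidth * scale;
  const h = img.naturalHeight * scale;
  // Negative offsets center the scaled image and crop the overflow.
  ctx.drawImage(img, (size - w) / 2, (size - h) / 2, w, h);
  return canvas;
}
```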
3. capture
The call to capture grabs the pixels from the image, then crops that tensor with cropTensor. This crop enforces that the image is square, but here it doesn’t do anything, since the image has already been cropped square in cropTo. Finally, it normalizes the values in RGB space to [-1, 1].
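In tfjs terms, my rough understanding of that step is something like the following sketch (not the library’s exact code):

```ts
import * as tf from '@tensorflow/tfjs';

// Sketch of the capture step as I understand it: read pixels, center-crop
// the tensor to a square (a no-op here, since cropTo already produced a
// square canvas), then normalize [0, 255] -> [-1, 1].
function capture(canvas: HTMLCanvasElement): tf.Tensor3D {
  return tf.tidy(() => {
    const pixels = tf.browser.fromPixels(canvas); // int32, values in [0, 255]
    const size = Math.min(pixels.shape[0], pixels.shape[1]);
    const top = Math.floor((pixels.shape[0] - size) / 2);
    const left = Math.floor((pixels.shape[1] - size) / 2);
    const square = tf.slice(pixels, [top, left, 0], [size, size, 3]);
    // x / 127.5 - 1 maps 0 -> -1 and 255 -> 1.
    return square.toFloat().div(127.5).sub(1) as tf.Tensor3D;
  });
}
```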
4. Transparency
It also seems like fully transparent pixels may be translated to rgb(0, 0, 0). That happened in one example image I tried, but I didn’t look further.
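If that’s right, it would be consistent with how canvases work: a fresh canvas starts out as transparent black (rgba(0, 0, 0, 0)), drawing a fully transparent pixel over it leaves it unchanged, and reading back only the RGB channels drops the alpha. A quick probe (here img is assumed to be a loaded image whose top-left pixel is fully transparent):

```ts
// Probe: fully transparent source pixels read back as rgb(0, 0, 0) once
// drawn onto a fresh canvas and stripped of their alpha channel.
const probe = document.createElement('canvas');
probe.width = probe.height = 1;
const ctx = probe.getContext('2d')!;
ctx.drawImage(img, 0, 0); // `img` assumed: top-left pixel has alpha = 0
const { data } = ctx.getImageData(0, 0, 1, 1);
console.log(data); // Uint8ClampedArray [0, 0, 0, 0] -> RGB reads as black
```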
Is that capturing it? This scaling, cropping, and color behavior seems important for callers (or users) to be aware of.
Exposing as a function
I think ideally this library would also expose its pre-processing for callers to use, so that tools built on it can apply the same pre-processing. Otherwise, if you made a tool that visualized the images the model predicts on, you might naively render the input image (which isn’t actually what the TM model sees), or you might compare the TM model to other models without using the same pre-processing step. Concretely, one suggestion would be to expose something like:
```ts
model.preprocess(image: ClassifierInputSource)
```
Returns a Tensor representing the image, after applying any transformations that the model applies to an input (e.g., scaling, cropping, or normalizing). The specifics of the particular transformations are internals of the library and subject to breaking changes, but this public method would be stable.
Args:
- image: an image, canvas, or video element to make a classification on
Usage:
```js
const img = new Image();
img.src = '...'; // some image that is larger than 224x224px, not square, and has some transparency
img.onload = async () => {
  const tensor = await model.preprocess(img);
  const canvas = document.createElement('canvas');
  await tf.browser.toPixels(tensor, canvas);
  document.body.appendChild(canvas);
};
document.body.appendChild(img);
```
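One wrinkle with rendering the result this way: tf.browser.toPixels expects float tensors with values in [0, 1], so a [-1, 1] tensor would need to be mapped back before display, which may account for artifacts like the background color below:

```ts
// toPixels expects floats in [0, 1], so map [-1, 1] back before drawing.
const displayable = tensor.add(1).div(2) as tf.Tensor3D;
await tf.browser.toPixels(displayable, canvas);
```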
[original image]
[pre-processed image]
(note also the color in the background, which I’m assuming was introduced by translating from [0, 255] => [-1, 1] => [0, 255], but I didn’t look further)
Thanks for sharing this awesome work 😄
Top GitHub Comments
Thanks for sharing, Kevin! We’ll look into this as soon as a bit of work clears up. Some great thoughts and suggestions in here.
@kevinrobinson Well, yes, I am experiencing this issue. I built an image classifier, but at inference time I am not sure what sort of processing to apply to the input image. I know the images used in training are cropped to a square, but nothing about the dimensions or scaling. I wish that information could be made public so I could implement it in my JavaScript code.