Image interpolation method for downscaling (nearest neighbour, antialias)
I was testing different resizing algorithms and I noticed that the Nearest Neighbour algorithm is way faster than Antialiasing.
I was just wondering why ImageHash uses Image.ANTIALIAS over Image.NEAREST for something that will only be processed by the program (we don't really care about how the image looks if we have the features).
Issue Analytics
- State:
- Created 3 years ago
- Comments:12 (6 by maintainers)
Having messed around with image resizing in Python, I have a few comments:
Using NEAREST to generate the hash sounds like a bad idea. NEAREST is very sensitive to minor image shifts, as it will just pick a pixel value more or less at random, depending on which pixel is used. So the hashes for similar but slightly shifted images will be very different, which is (IMHO) not the goal of the hashing. So a better algorithm is needed.

A common approach to speed up image scaling in Python is to do it in two steps: scale the image down to 2x, 4x or 8x the target size using NEAREST and then do a final step using ANTIALIAS. The result will not be exactly the same as a full antialias, but it is much faster and significantly closer than NEAREST alone.

In addition to that, PIL has a special method to downscale images quickly: thumbnail. Thumbnail is most effective if it is used before the image data is loaded, as it can tell the file loader to only load the data needed for the smaller image. This works very well for JPEG, less well for other formats.
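To make the two faster variants concrete, here is a minimal sketch of both with Pillow. It is not imagehash code: the function names and the `factor` parameter are made up for illustration, and `Image.LANCZOS` stands in for `Image.ANTIALIAS` (the latter was an alias for LANCZOS and was removed in Pillow 10).

```python
from PIL import Image


def resize_two_step(img, size, factor=4):
    """Downscale in two passes: cheap NEAREST first, then LANCZOS.

    `factor` (hypothetical parameter) is how many times larger than the
    target the intermediate image is, e.g. 2, 4 or 8.
    """
    width, height = size
    intermediate = (width * factor, height * factor)
    # Only take the NEAREST shortcut if the source is larger than the
    # intermediate size; otherwise go straight to the filtered resize.
    if img.width > intermediate[0] and img.height > intermediate[1]:
        img = img.resize(intermediate, Image.NEAREST)
    return img.resize(size, Image.LANCZOS)


def resize_via_thumbnail(fp, size, factor=4):
    """Open a file lazily and let thumbnail() shrink it before the final resize.

    thumbnail() works in place, keeps the aspect ratio and never enlarges,
    so a last resize() is still needed to reach the exact hash grid.
    """
    width, height = size
    with Image.open(fp) as img:
        # Called before the pixel data is loaded, so the decoder can
        # skip work for formats that support it (notably JPEG).
        img.thumbnail((width * factor, height * factor))
        return img.convert("L").resize(size, Image.LANCZOS)
```

For example, `resize_via_thumbnail("photo.jpg", (9, 8))` would return a 9x8 grayscale image ready for a difference hash; the file name is a placeholder.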
I did run some tests (code at the end). I used the same test images as @Animenosekai, but I did not preload the data. I did warm the file system cache by loading all images once and discarding them.
Here is an example image and its 8x8 versions for comparison (images omitted): Original, Nearest, Anti-Alias (AA), Nearest+AA (N+AA), Thumbnail (TH).
In this view the differences between AA, N+AA and TH seem to be almost invisible, but the hashes do find differences. I tested it with dhash, and while nearest has an average distance of 16 (clearly unacceptable) to AA, both N+AA and TH have 3.65, which is much better but still noticeable. This probably only really matters in cases where there are old, existing hashes to compare to; for new projects and databases I wouldn't expect to see a difference in detection rate for either of these algorithms.

Interestingly, I got fairly different performance numbers than @Animenosekai. In my tests without preloading data and with calculating the hash (from the scaled-down image) the differences between the algorithms were very small:
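For readers unfamiliar with how such a distance is computed, the following is a rough pure-Python sketch of a difference hash and the Hamming distance between two hashes. It is an illustration under the assumption that each bit records whether a pixel is brighter than its left neighbour; the helper names are hypothetical and this is not the code used for the numbers above.

```python
def dhash_bits(gray_rows):
    """Build difference-hash bits from an already downscaled grayscale grid.

    `gray_rows` holds hash_size rows of hash_size + 1 values each; every
    bit says whether a pixel is brighter than its left neighbour.
    """
    return [
        right > left
        for row in gray_rows
        for left, right in zip(row, row[1:])
    ]


def hamming(bits_a, bits_b):
    """Count the positions at which two equally long hashes differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


def mean_distance(hashes_a, hashes_b):
    """Average Hamming distance over paired hashes (e.g. AA vs. NEAREST)."""
    pairs = list(zip(hashes_a, hashes_b))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)
```

If the hashes are 64 bits long (an 8x8 grid), an average distance of 16 means a quarter of the bits differ, while 3.65 corresponds to fewer than four bits.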
So, not very exciting. 😒
However, the test faces are only 64x64 pixels, which is very small. I tested it with some larger (~3000x4000) JPEG test images, and got more interesting results:
So THUMBNAIL is 3x faster than AA, and about 2x faster than NEAREST.

So at this point I'm not sure what I would recommend. My use case is more like the second test: large images on disc. In that case THUMBNAIL makes a big difference, so I would love seeing it in imagehash.

Just my $.02…
Test Code:
A subtlety is that some hashes do hashsize x hashsize, some do (hashsize + 1) x hashsize, but we can spell that out.
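One way to spell that out is a small helper that returns the grid to downscale to. This is only a sketch: `target_size` and its `extra_column` flag are hypothetical names, not part of imagehash.

```python
def target_size(hash_size, extra_column=False):
    """Return the (width, height) an image is downscaled to before hashing.

    extra_column=False -> hash_size x hash_size
    extra_column=True  -> (hash_size + 1) x hash_size, as needed when each
                          bit compares two horizontally adjacent pixels.
    """
    if hash_size < 2:
        raise ValueError("hash_size must be at least 2")
    width = hash_size + 1 if extra_column else hash_size
    return (width, hash_size)
```

A shared fast-downscale step could then take this tuple as its target, whichever hash is being computed.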